Re: [HACKERS] Auto creation of Partitions

2007-03-12 Thread Robert Treat
On Saturday 10 March 2007 00:13, NikhilS wrote:
 Hi,

   Given that Simon wants to do away with having the master table APPENDed
 
  in
 
   the planning phase, this would be better.
 
  ISTM you're trading appending the master table for appending the DUMP
  partition, which afaict would give you no gain.

 If there are entries in the master table, I think it would get appended for
 all queries regardless of whether we need to examine its contents or not.
 Segregating dump data into a partition will avoid that.

 I have seen examples in some other databases wherein a partition specifies
 a range of someval - MAXINT for instance, to catch such cases.

 That again means that the onus is on the partition creator most of the
 times..

*shrug*...   we can do that now in pgsql

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Auto creation of Partitions

2007-03-09 Thread Zeugswetter Andreas ADI SD

  Since partition is inheritance-based, a simple DROP or  NO 
 INHERIT 
  will do the job to deal with the partition. Do we want to reinvent 
  additional syntax when these are around and are documented?
 
 Well, if the syntax for adding a new partition eventually 
 ends up as ALTER TABLE ADD PARTITION, then it would make more 
 sense that you remove a partition via ALTER TABLE DROP PARTITION.

But DROP PARTITION usually moves the data from this partition to other
partitions,
so it is something different.

Andreas

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Auto creation of Partitions

2007-03-09 Thread Simon Riggs
On Fri, 2007-03-09 at 11:48 +0530, NikhilS wrote:
 Hi, 
 
 On 3/9/07, Shane Ambler [EMAIL PROTECTED] wrote:
 
  Note to Nikhil: Make sure the new syntax doesn't prevent
 partitions from
  being placed upon multiple tablespaces in some manner, at
 CREATE TABLE
  time.
 
 What if the syntax was something like - 
 
 CREATE TABLE tabname (
  ...
  ...
   ) PARTITION BY
   HASH(expr)
 | RANGE(expr)
 | LIST(expr)
 [PARTITIONS num_partitions] /* will apply to HASH only for
 now*/
 [PARTITION partition_name CHECK(...) [USING TABLESPACE
 tblspcname], 
   PARTITION partition_name CHECK(...) [USING TABLESPACE
 tblspcname]
   ...
 ];
 
 
 And (if we use the ALTER TABLE to add partitions)
 
 ALTER TABLE tabname
 ADD PARTITION partition_name CHECK(...)
 [USING TABLESPACE tblspcname];
 
 
 
 We could as well drop the USING part. 

Why would we support HASH partitions?
If you did, the full syntax for hash clusters should be supported.

If we do the CHECK clauses like that then we still have don't have a
guaranteed non-overlap between partitions. It would be easier to use
Oracle syntax and then construct the CHECK clauses from that.

Also, the syntax needs to be fairly complex to allow for a mixture of
modes, e.g. range and list partitioning. That is currently possible
today and the syntax for doing that is IMHO much simpler than the Oracle
simple way of specifying it.

An alternative is to provide a partitioning function which decides which
partition each values goes into.

PARTITION FUNCTION which_partition(date_col)

The partition function must return an unsigned integer  0, which would
correspond to particular partitions. Partitions would be numbered 1..N,
and named tablename_partM where 1 = M = N.

The input and contents of the partition function would be up to the
user. e.g. 

CREATE FUNCTION range_partition(date date_col)
{
if (date_col  D1)
return 1;
else if (date_col  D2)
return 2;
else if (date_col  D3)
return 3;

return 4;
}

Doing it this way would allow us to easily join two tables based upon a
common partition function.

In time, I would suggest we support both ways: declarative and
functional.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Auto creation of Partitions

2007-03-09 Thread NikhilS

Hi,



Why would we support HASH partitions?
If you did, the full syntax for hash clusters should be supported.



In MySQL, Oracle, the syntax for HASH partitions seems to be similar to the
one mentioned. I do not know much about hash clusters though.

If we do the CHECK clauses like that then we still have don't have a

guaranteed non-overlap between partitions. It would be easier to use
Oracle syntax and then construct the CHECK clauses from that.



Again Oracle, MySQL use VALUES LESS THAN  (expr) format for RANGE
partitions. So you mean that they end up creating ranges like MININT -
Range1, Range1+1 - Range2 etc for each of the partitions?

I think Postgres users are used to the CHECK clauses and I still feel that
the onus of distinct partitions lies on the partition creator.

Also, the syntax needs to be fairly complex to allow for a mixture of

modes, e.g. range and list partitioning. That is currently possible
today and the syntax for doing that is IMHO much simpler than the Oracle
simple way of specifying it.



Subpartitioning is not being targeted right now, but could be put on the
TODO list for further enhancements.

An alternative is to provide a partitioning function which decides which

partition each values goes into.

PARTITION FUNCTION which_partition(date_col)

The partition function must return an unsigned integer  0, which would
correspond to particular partitions. Partitions would be numbered 1..N,
and named tablename_partM where 1 = M = N.

The input and contents of the partition function would be up to the
user. e.g.

CREATE FUNCTION range_partition(date date_col)
{
if (date_col  D1)
return 1;
else if (date_col  D2)
return 2;
else if (date_col  D3)
return 3;

return 4;
}

Doing it this way would allow us to easily join two tables based upon a
common partition function.

In time, I would suggest we support both ways: declarative and
functional.



Till now, we are going the declarative way.

Regards,
Nikhils
--
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] Auto creation of Partitions

2007-03-09 Thread Luke Lonergan
Simon,

What happens to the data when the function is dropped or replaced?

- Luke

Msg is shrt cuz m on ma treo

 -Original Message-
From:   Simon Riggs [mailto:[EMAIL PROTECTED]
Sent:   Friday, March 09, 2007 06:20 AM Eastern Standard Time
To: NikhilS
Cc: Shane Ambler; Luke Lonergan; Zeugswetter Andreas ADI SD; Peter 
Eisentraut; pgsql-hackers@postgresql.org
Subject:Re: [HACKERS] Auto creation of Partitions

On Fri, 2007-03-09 at 11:48 +0530, NikhilS wrote:
 Hi, 
 
 On 3/9/07, Shane Ambler [EMAIL PROTECTED] wrote:
 
  Note to Nikhil: Make sure the new syntax doesn't prevent
 partitions from
  being placed upon multiple tablespaces in some manner, at
 CREATE TABLE
  time.
 
 What if the syntax was something like - 
 
 CREATE TABLE tabname (
  ...
  ...
   ) PARTITION BY
   HASH(expr)
 | RANGE(expr)
 | LIST(expr)
 [PARTITIONS num_partitions] /* will apply to HASH only for
 now*/
 [PARTITION partition_name CHECK(...) [USING TABLESPACE
 tblspcname], 
   PARTITION partition_name CHECK(...) [USING TABLESPACE
 tblspcname]
   ...
 ];
 
 
 And (if we use the ALTER TABLE to add partitions)
 
 ALTER TABLE tabname
 ADD PARTITION partition_name CHECK(...)
 [USING TABLESPACE tblspcname];
 
 
 
 We could as well drop the USING part. 

Why would we support HASH partitions?
If you did, the full syntax for hash clusters should be supported.

If we do the CHECK clauses like that then we still have don't have a
guaranteed non-overlap between partitions. It would be easier to use
Oracle syntax and then construct the CHECK clauses from that.

Also, the syntax needs to be fairly complex to allow for a mixture of
modes, e.g. range and list partitioning. That is currently possible
today and the syntax for doing that is IMHO much simpler than the Oracle
simple way of specifying it.

An alternative is to provide a partitioning function which decides which
partition each values goes into.

PARTITION FUNCTION which_partition(date_col)

The partition function must return an unsigned integer  0, which would
correspond to particular partitions. Partitions would be numbered 1..N,
and named tablename_partM where 1 = M = N.

The input and contents of the partition function would be up to the
user. e.g. 

CREATE FUNCTION range_partition(date date_col)
{
if (date_col  D1)
return 1;
else if (date_col  D2)
return 2;
else if (date_col  D3)
return 3;

return 4;
}

Doing it this way would allow us to easily join two tables based upon a
common partition function.

In time, I would suggest we support both ways: declarative and
functional.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] Auto creation of Partitions

2007-03-09 Thread Simon Riggs
On Fri, 2007-03-09 at 07:40 -0500, Luke Lonergan wrote:

 What happens to the data when the function is dropped or replaced?
 
Well, that wouldn't happen because you build in a dependency.

I'm not working on this, so don't expect lots of detail. The idea is
essentially to implement things the way SQLServer does it.

The function would need some care. It might even need a function that
writes a function.

e.g. 
CreateRangePartitionFunction('{date1,date2,date3}', funcname);
ReplaceRangePartitionFunction('{date1,date2,date3}', funcname);

which would then give a properly designed function called funcname to be
used for partitioning, which would include tests to make sure a
partition wasn't inadvertently excluded from the list of existing
partitions of any table that used it.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Auto creation of Partitions

2007-03-09 Thread Jim Nasby

On Mar 9, 2007, at 3:31 AM, Zeugswetter Andreas ADI SD wrote:

Since partition is inheritance-based, a simple DROP or  NO

INHERIT

will do the job to deal with the partition. Do we want to reinvent
additional syntax when these are around and are documented?


Well, if the syntax for adding a new partition eventually
ends up as ALTER TABLE ADD PARTITION, then it would make more
sense that you remove a partition via ALTER TABLE DROP PARTITION.


But DROP PARTITION usually moves the data from this partition to other
partitions,
so it is something different.


It does? IIRC every partitioning system I've seen DROP PARTITION  
drops the data as well. It's up to you to move it somewhere else if  
you want to keep it.

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Auto creation of Partitions

2007-03-09 Thread Hannu Krosing
Ühel kenal päeval, R, 2007-03-09 kell 15:41, kirjutas Jim Nasby:
 On Mar 9, 2007, at 3:31 AM, Zeugswetter Andreas ADI SD wrote:
  Since partition is inheritance-based, a simple DROP or  NO
  INHERIT
  will do the job to deal with the partition. Do we want to reinvent
  additional syntax when these are around and are documented?
 
  Well, if the syntax for adding a new partition eventually
  ends up as ALTER TABLE ADD PARTITION, then it would make more
  sense that you remove a partition via ALTER TABLE DROP PARTITION.
 
  But DROP PARTITION usually moves the data from this partition to other
  partitions,
  so it is something different.
 
 It does? IIRC every partitioning system I've seen DROP PARTITION  
 drops the data as well. It's up to you to move it somewhere else if  
 you want to keep it.

Will this proposed DROP PARTITION just disassociate the table from the
master, or will it actually drop the partitions table from the whole
database ?

-- 

Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Auto creation of Partitions

2007-03-09 Thread Robert Treat
On Friday 09 March 2007 01:23, NikhilS wrote:
 Hi,

  This follows on from the suggestion I made - taken along the lines of
  the subject auto creation of partitions where I suggested the syntax
  of partition check(month of mydatecol) and have a new partition created
  as data was entered. With this scenario dropping the partition when it
  was empty would complement the creation of a new partition as needed.
 
  Given that there seems to be no real support of going with auto
  maintenance were new partitions are added as needed, then the auto
  dropping of empty partitions would also not apply.
 
  Leaving us with only specific add partition / drop partition commands.
  And have the parent table pick up rows not matching any partition check
  criteria.

 I was thinking along the lines of what Jim had suggested earlier regarding
 overflow partition. Instead of dumping unmatched rows to the master table,
 we could put them into a default DUMP/DUMB partition.


I'm quite content dumping unmatched rows into the master table. This makes it 
very easy to scan partitioned tables for busted partition setups.  Having a 
DUMP tables seems only different semantically, so why learn new semantics? 

 Given that Simon wants to do away with having the master table APPENDed in
 the planning phase, this would be better.


ISTM you're trading appending the master table for appending the DUMP 
partition, which afaict would give you no gain.  

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Auto creation of Partitions

2007-03-09 Thread NikhilS

Hi,

On 3/10/07, Hannu Krosing [EMAIL PROTECTED] wrote:


Ühel kenal päeval, R, 2007-03-09 kell 15:41, kirjutas Jim Nasby:
 On Mar 9, 2007, at 3:31 AM, Zeugswetter Andreas ADI SD wrote:
  Since partition is inheritance-based, a simple DROP or  NO
  INHERIT
  will do the job to deal with the partition. Do we want to reinvent
  additional syntax when these are around and are documented?
 
  Well, if the syntax for adding a new partition eventually
  ends up as ALTER TABLE ADD PARTITION, then it would make more
  sense that you remove a partition via ALTER TABLE DROP PARTITION.
 
  But DROP PARTITION usually moves the data from this partition to other
  partitions,
  so it is something different.

 It does? IIRC every partitioning system I've seen DROP PARTITION
 drops the data as well. It's up to you to move it somewhere else if
 you want to keep it.

Will this proposed DROP PARTITION just disassociate the table from the
master, or will it actually drop the partitions table from the whole
database ?



Thats why I would prefer the existing mechanism, there a DROP on the child
removes it and a NO INHERIT disassociates it. There might be situations
where we would want to just disassociate and not drop.

Regards,
Nikhils

--


Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com






--
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] Auto creation of Partitions

2007-03-09 Thread NikhilS

Hi,


 Given that Simon wants to do away with having the master table APPENDed
in
 the planning phase, this would be better.


ISTM you're trading appending the master table for appending the DUMP
partition, which afaict would give you no gain.




If there are entries in the master table, I think it would get appended for
all queries regardless of whether we need to examine its contents or not.
Segregating dump data into a partition will avoid that.

I have seen examples in some other databases wherein a partition specifies a
range of someval - MAXINT for instance, to catch such cases.

That again means that the onus is on the partition creator most of the
times..
Regards,
Nikhils




--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL





--
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] Auto creation of Partitions

2007-03-08 Thread NikhilS

Hi,

Maybe, the dummy entry could be extended to contain the bounds (max/min) for
each of the other involved partitions and they could be updated each time a
DML happens across the partitions. That ways, an update to a particular
partition needs to lock out the others, examine the dummy entries in its own
index and follow it up with dummy entries update into other partitions if
the need be.

Ofcourse as you have mentioned all of this so needs to be done after a
careful think on the locking/deadlocking etc issues.

Regards,
Nikhils

On 3/7/07, Alvaro Herrera [EMAIL PROTECTED] wrote:


I am wondering if we can implement unique indexes across several tables
(inheritance hierarchy) not by using a single, big index covering all
the tables, but rather by inserting a dummy entry into each partition's
unique index.  This dummy entry would have an expanded CTID which would
include the tableoid, so it's possible to check it (albeit there is a
problem in that we may require the opening of another heap to do the
actual checking).  These dummy entries could be removed by bulkcleanup
as soon as the inserting transaction is no longer running, to avoid
bloating the index too much.  All said dummy index entries would be
located at either the rightmost or the leftmost leaf, or close to it, so
another idea is to have future inserters reuse the entry for a different
key.

The obvious problem with this is, naturally, the excess I/O that extra
index traversing causes.  The not so obvious ones are locking,
deadlocking and the opening of other heaps and indexes while you do the
insertion, which may be too expensive.  On the other hand, maybe this
idea is easier to implement than full-fledged cross-table indexes, so we
could have richer partitioning earlier than when somebody finally bites
the bullet and implements cross-table indexes.

Or maybe this is just a dumb idea, but I had to let it out anyway :-)

--
Alvaro Herrera
http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.





--
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] Auto creation of Partitions

2007-03-08 Thread NikhilS

Hi,



There are 2 other reasons to favor triggers though:

1) People (Josh Drake comes to mind) have found that if you get over
a tiny number of partitions, the performance of rules is abysmal.

2) I believe it should be possible to construct an update trigger
that allows you to perform updates that will place the row in
question into a new partition. While I can see cases for simply
disallowing updates to the partitioning key, I think there are also
times when being able to do that would be very useful.



The consensus seems to be veering towards triggers.



I think it'd be great to make adding and removing partitions as
simple as ALTER TABLE. I don't think that DELETE should be the
mechanism to drop a partition, though. Again, DML statements
shouldn't be performing DDL.



Since partition is inheritance-based, a simple DROP or  NO INHERIT will do
the job to deal with the partition. Do we want to reinvent additional syntax
when these are around and are documented?

Regards,
Nikhils
--
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] Auto creation of Partitions

2007-03-08 Thread Simon Riggs
On Wed, 2007-03-07 at 22:32 -0500, Luke Lonergan wrote:
 Andreas,
 
 On 3/7/07 11:45 AM, Zeugswetter Andreas ADI SD [EMAIL PROTECTED]
 wrote:
 
  Whoa, do you have anything to back that up ?
 
 Sure - when we start to consider designs that implement advanced data
 management features, we run into problems with the architecture of
 tables-tables-tables  Here are some examples:
 1 - people think of partitions as a logical building block for tables, they
 would like to move partitions around underneath a table without the table
 definition being involved.  In the current implementation, there are
 explicit linkages between the table definition and the child tables -
 imagine an ALTER TABLE foo_parent ADD COLUMN and how it would need to
 cascade to 1,000 child tables and you get the beginning of it - this
 connection should not exist.

The inheritance can work at multiple levels, so its up to you how you
manage things. If you want to add a column to only the newest tables,
you just add a new mid-level table, add the new column only to that and
then make all new partitions inherit from that table rather than the
main table. So I don't see the objection here, I see a benefit.

 2 - INSERT/UPDATE/DELETE processing through the SQL rewrite layer (rules) is
 terribly slow and gets slower as you add more partitions.  If done closer to
 the storage layer, this can be done in ways that use access methods shared
 with other storage entities, e.g. Indices, and the code path would flow more
 naturally.

That bit is just syntactic sugar and unnecessary, for most applications,
in my experience - especially because it doesn't work with COPY.

People do seem to want it, so as I said upthread, we need a way of
speeding up the selection of the appropriate partition, so we can get
this to work for individual INSERTs.

This needs substantial improvement, but I don't see this challenging the
partition == table assumption.

 3 - Parallel query can be accomplished more easily by separating scans
 across relations split among tablespaces.  This is more natural than trying
 to parallelize APPEND nodes within existing plans

Tables- Partitions - Tablespaces

Changing Partitions from Tables to SomethingElse won't increase the
current capability to define the disk layout for concurrent I/O.

Parallel Query needs to be planner-aware, so if we change partitions
from being tables, then we'd need to reintroduce them to the planner.
APPEND is clearly not the only thing that's needed for parallel query.

Note to Nikhil: Make sure the new syntax doesn't prevent partitions from
being placed upon multiple tablespaces in some manner, at CREATE TABLE
time.

  You would need to elaborate what you actually mean, but I think it is
  moot.
  Sure, the constraint technique can be further extended (e.g. during
  runtime), but imho the approach is very good.
 
 Well, it's being used and that's good, but it needs to be better IMO and I
 think that before we go too far down the current path we should consider the
 alternatives more carefully.

I'm happy to reconsider things, but we need to do that with some clear
analysis of what doesn't work yet and how best to implement that.

None of what's been mentioned requires us to reconsider the 
Partition == Table assumption. There were other ways considered, but
they didn't win out in the analysis, for documented reasons.

If there are benefits to having partitions act like tables, then maybe
we can make them behave differently in just those circumstances. Pack
animals behave differently in a group, so why not tables, if need be?
But when is that exactly?

There is a ton of work to make partitioning the elegant beast we'd like
it to be, but that seems like extension only, not rip and replace.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Auto creation of Partitions

2007-03-08 Thread Simon Riggs
On Wed, 2007-03-07 at 21:27 -0700, Jim Nasby wrote:
 On Mar 7, 2007, at 3:26 AM, Simon Riggs wrote:
  If you know that the constraints on each of the tables is distinct,  
  then
  building a UNIQUE index on each of the partitions is sufficient to  
  prove
  that all rows in the combined partitioned table are distinct also.
 
  The hard part there is checking that the partition constraints are
  distinct. If the partition constraints are added one at a time, you  
  can
  use the predicate testing logic to compare the to-be-added partition's
  constraint against each of the already added constraints. That becomes
  an O(N) problem.
 
  What is really needed is a data structure that allows range partitions
  to be accessed more efficiently. This could make adding partitions and
  deciding in which partition a specific value goes an O(logN)  
  operation.
 
 Directing data to child tables with triggers pretty much necessitates  
 having some way to codify what partition a particular row belongs in.  
 IE: for partitioning by month, you'll see things like naming the  
 partition tables parent_table_name_$YEAR_$MONTH, so the  
 'partitioning function' takes a date or timestamp and then returns  
 what partition it belongs to. Perhaps there is some way to use that  
 mapping to drive the selection of what partitions could contain a  
 given value?
 
 One possibility would be to require 3 functions for a partitioned  
 table: one accepts the partitioning key and tells you what partition  
 it's in, one that tells you what the minimum partitioning key for a  
 partition would be, and one that tells you what the maximum would be.  
 If the user supplied those 3 functions, I think it would be possibly  
 to automatically generate code for the triggers and check  
 constraints. The min/max partition key functions might allow you to  
 more efficiently do partition elimination, too.

ISTM this is a good idea.

SQLServer uses partitioning functions and I like that approach. It makes
it much easier to do partition-wise joins between tables that share
partitioning functions.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Auto creation of Partitions

2007-03-08 Thread Gregory Stark
Jim Nasby [EMAIL PROTECTED] writes:

 One possibility would be to require 3 functions for a partitioned table: one
 accepts the partitioning key and tells you what partition  it's in, one that
 tells you what the minimum partitioning key for a  partition would be, and one
 that tells you what the maximum would be.  If the user supplied those 3
 functions, I think it would be possibly  to automatically generate code for 
 the
 triggers and check  constraints. The min/max partition key functions might
 allow you to  more efficiently do partition elimination, too.

But then it would be harder to tell whether a clause implied a given
partition. That is, if you have a partition constraint of col OP const then
we can test whether a query clause of col OP2 const2 implies that constraint
when planning (or actually whether it implies it's false to exclude the
partition). If you have a constraint like P1(const) it'll be pretty hard to
do much with that.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Auto creation of Partitions

2007-03-08 Thread Zeugswetter Andreas ADI SD
  The hard part there is checking that the partition constraints are 
  distinct. If the partition constraints are added one at a time, you 
  can use the predicate testing logic to compare the to-be-added 
  partition's constraint against each of the already added
constraints. 
  That becomes an O(N) problem.

Yes, we could preevaluate that check (per index) in the DDL phase and
keep the info in a flag. 
Also interesting info is if there is an order the partitions can be read
in to satisfy a particular order by.

  What is really needed is a data structure that allows range
partitions 
  to be accessed more efficiently. This could make adding partitions
and 
  deciding in which partition a specific value goes an O(logN) 
  operation.

I do not really see a problem with O(N) since typical N currently range
from 10 to 200.
N = 1000 is already good for a = 10 TB table.
If a 10 GB partition were too large we should imho invest more in the
advanced indexing methods that are currently beeing developed.

 Directing data to child tables with triggers pretty much 
 necessitates having some way to codify what partition a 
 particular row belongs in.  
 IE: for partitioning by month, you'll see things like naming 
 the partition tables parent_table_name_$YEAR_$MONTH, so the 
 'partitioning function' takes a date or timestamp and then 
 returns what partition it belongs to. Perhaps there is some 
 way to use that mapping to drive the selection of what 
 partitions could contain a given value?

You put it in the first partition that has matching constraints.

 One possibility would be to require 3 functions for a partitioned
 table: one accepts the partitioning key and tells you what 
 partition it's in, one that tells you what the minimum 
 partitioning key for a partition would be, and one that tells 
 you what the maximum would be.  
 If the user supplied those 3 functions, I think it would be 
 possibly to automatically generate code for the triggers and 
 check constraints. The min/max partition key functions might 
 allow you to more efficiently do partition elimination, too.

I can see this as a good optional addition, but it can only be optional
else it would pretty much limit the methods that can be used for
partitioning. e.g. hash, modulo do not have a min,max per partition.

Andreas

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Auto creation of Partitions

2007-03-08 Thread Jim C. Nasby
On Thu, Mar 08, 2007 at 01:52:27PM +0530, NikhilS wrote:
 I think it'd be great to make adding and removing partitions as
 simple as ALTER TABLE. I don't think that DELETE should be the
 mechanism to drop a partition, though. Again, DML statements
 shouldn't be performing DDL.
 
 
 Since partition is inheritance-based, a simple DROP or  NO INHERIT will do
 the job to deal with the partition. Do we want to reinvent additional syntax
 when these are around and are documented?

Well, if the syntax for adding a new partition eventually ends up as
ALTER TABLE ADD PARTITION, then it would make more sense that you remove
a partition via ALTER TABLE DROP PARTITION.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Auto creation of Partitions

2007-03-08 Thread Jim C. Nasby
On Thu, Mar 08, 2007 at 10:12:30AM +, Gregory Stark wrote:
 Jim Nasby [EMAIL PROTECTED] writes:
 
  One possibility would be to require 3 functions for a partitioned table: one
  accepts the partitioning key and tells you what partition  it's in, one that
  tells you what the minimum partitioning key for a  partition would be, and 
  one
  that tells you what the maximum would be.  If the user supplied those 3
  functions, I think it would be possibly  to automatically generate code for 
  the
  triggers and check  constraints. The min/max partition key functions might
  allow you to  more efficiently do partition elimination, too.
 
 But then it would be harder to tell whether a clause implied a given
 partition. That is, if you have a partition constraint of col OP const then
 we can test whether a query clause of col OP2 const2 implies that constraint
 when planning (or actually whether it implies it's false to exclude the
 partition). If you have a constraint like P1(const) it'll be pretty hard to
 do much with that.

Well, you could tell what partition 'const' was in; I would think that
plus knowledge about OP2 would allow you to decide what partitions you
need to look at.

There's also nothing to prevent us from also adding the constraints and
using constraint exclusion as well. In fact, I think we'd want to have
the constraints just so we know that a given partition only contains the
data we want it to.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Auto creation of Partitions

2007-03-08 Thread Jim C. Nasby
One other thought... a lot of this discussion seems to parallel the
debate from a while ago about whether SERIAL should act like a macro
(ie: it just sets everything up and users are free to monkey under the
hood afterwards), or whether it should be it's own 'closed-box'
construct.

Currently, we seem to be leaning towards partition management being a
'macro', with child tables very exposed, etc. I don't know if that's
good or bad, but it's probably worth some thought.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Auto creation of Partitions

2007-03-08 Thread Shane Ambler



Note to Nikhil: Make sure the new syntax doesn't prevent partitions from
being placed upon multiple tablespaces in some manner, at CREATE TABLE
time.


What if the syntax was something like -

CREATE TABLE tabname (
...
...
 ) PARTITION BY
 HASH(expr)
| RANGE(expr)
| LIST(expr)
[PARTITIONS num_partitions] /* will apply to HASH only for now*/
[PARTITION partition_name CHECK(...) [USING TABLESPACE tblspcname],
 PARTITION partition_name CHECK(...) [USING TABLESPACE tblspcname]
 ...
];


And (if we use the ALTER TABLE to add partitions)

ALTER TABLE tabname
ADD PARTITION partition_name CHECK(...)
[USING TABLESPACE tblspcname];


Of course ALTER TABLE childtable SET TABLESPACE tblspcname; should not 
cause any probs.



--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Auto creation of Partitions

2007-03-08 Thread Shane Ambler

Jim C. Nasby wrote:

On Thu, Mar 08, 2007 at 01:52:27PM +0530, NikhilS wrote:

I think it'd be great to make adding and removing partitions as
simple as ALTER TABLE. I don't think that DELETE should be the
mechanism to drop a partition, though. Again, DML statements
shouldn't be performing DDL.


Since partition is inheritance-based, a simple DROP or  NO INHERIT will do
the job to deal with the partition. Do we want to reinvent additional syntax
when these are around and are documented?


Well, if the syntax for adding a new partition eventually ends up as
ALTER TABLE ADD PARTITION, then it would make more sense that you remove
a partition via ALTER TABLE DROP PARTITION.


This follows on from the suggestion I made - taken along the lines of 
the subject auto creation of partitions where I suggested the syntax 
of partition check(month of mydatecol) and have a new partition created 
as data was entered. With this scenario dropping the partition when it 
was empty would complement the creation of a new partition as needed.


Given that there seems to be no real support of going with auto 
maintenance were new partitions are added as needed, then the auto 
dropping of empty partitions would also not apply.


Leaving us with only specific add partition / drop partition commands.
And have the parent table pick up rows not matching any partition check 
criteria.



--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Auto creation of Partitions

2007-03-08 Thread NikhilS

Hi,

On 3/9/07, Shane Ambler [EMAIL PROTECTED] wrote:



 Note to Nikhil: Make sure the new syntax doesn't prevent partitions from
 being placed upon multiple tablespaces in some manner, at CREATE TABLE
 time.

What if the syntax was something like -

CREATE TABLE tabname (
 ...
 ...
  ) PARTITION BY
  HASH(expr)
| RANGE(expr)
| LIST(expr)
[PARTITIONS num_partitions] /* will apply to HASH only for now*/
[PARTITION partition_name CHECK(...) [USING TABLESPACE tblspcname],
  PARTITION partition_name CHECK(...) [USING TABLESPACE tblspcname]
  ...
];


And (if we use the ALTER TABLE to add partitions)

ALTER TABLE tabname
ADD PARTITION partition_name CHECK(...)
[USING TABLESPACE tblspcname];



We could as well drop the USING part.

Regards,
Nikhils
--
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] Auto creation of Partitions

2007-03-08 Thread NikhilS

Hi,



This follows on from the suggestion I made - taken along the lines of
the subject auto creation of partitions where I suggested the syntax
of partition check(month of mydatecol) and have a new partition created
as data was entered. With this scenario dropping the partition when it
was empty would complement the creation of a new partition as needed.

Given that there seems to be no real support of going with auto
maintenance were new partitions are added as needed, then the auto
dropping of empty partitions would also not apply.

Leaving us with only specific add partition / drop partition commands.
And have the parent table pick up rows not matching any partition check
criteria.



I was thinking along the lines of what Jim had suggested earlier regarding
overflow partition. Instead of dumping unmatched rows to the master table,
we could put them into a default DUMP/DUMB partition.

Given that Simon wants to do away with having the master table APPENDed in
the planning phase, this would be better.

Regards,
Nikhils


--


Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz





--
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] Auto creation of Partitions

2007-03-07 Thread Peter Eisentraut
Am Mittwoch, 7. März 2007 07:44 schrieb NikhilS:
 Sure, but as Chris mentioned earlier, wouldn't it be useful to maintain
 uniqueness on a
 partition-by-partition basis too?

Many things might be useful, but the aim of the table partitioning venture 
is believed to be the provision of a transparent interface to the existing 
do-it-yourself partitioning facilities.  Therefore, the logical definition of  
a table must continue to work unchanged (or alternatively, the use of the 
feature must be prohibited if that cannot be guaranteed in a particular 
case).  Other features such as uniqueness on a partition basis might also be 
useful but you can do that today and you don't need partitioning for that.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Auto creation of Partitions

2007-03-07 Thread Simon Riggs
On Tue, 2007-03-06 at 18:31 +0530, NikhilS wrote:

 On 3/6/07, Peter Eisentraut [EMAIL PROTECTED] wrote:
 NikhilS wrote:
  iv) Based on the PRIMARY, UNIQUE, REFERENCES information
 specified,
  pass it on to the children tables.
 
 How will you maintain a primary key in such a table,
 considering that
 indexes can't span multiple tables?
 
 We will not (I know its a hard thing to do :) ), the intention is to
 use this information from the parent and make it a property of the
 child table. This will avoid the step for the user having to manually
 specify CREATE INDEX and the likes on all the children tables
 one-by-one. 

If you know that the constraints on each of the tables is distinct, then
building a UNIQUE index on each of the partitions is sufficient to prove
that all rows in the combined partitioned table are distinct also.

The hard part there is checking that the partition constraints are
distinct. If the partition constraints are added one at a time, you can
use the predicate testing logic to compare the to-be-added partition's
constraint against each of the already added constraints. That becomes
an O(N) problem.

What is really needed is a data structure that allows range partitions
to be accessed more efficiently. This could make adding partitions and
deciding in which partition a specific value goes an O(logN) operation.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Auto creation of Partitions

2007-03-07 Thread NikhilS

Hi,

If you know that the constraints on each of the tables is distinct, then

building a UNIQUE index on each of the partitions is sufficient to prove
that all rows in the combined partitioned table are distinct also.

The hard part there is checking that the partition constraints are
distinct. If the partition constraints are added one at a time, you can
use the predicate testing logic to compare the to-be-added partition's
constraint against each of the already added constraints. That becomes
an O(N) problem.

What is really needed is a data structure that allows range partitions
to be accessed more efficiently. This could make adding partitions and
deciding in which partition a specific value goes an O(logN) operation.



If the unique constraint is supposed to be on a column which is NOT being
used for the partitioning, then all the above becomes much more difficult.
While partitioning, the additional onus on the user is to specify
non-conflicting CHECKs for the range/list partitions.

Regards,
Nikhils

--

  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com






--
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] Auto creation of Partitions

2007-03-07 Thread Zeugswetter Andreas ADI SD

  iv) Based on the PRIMARY, UNIQUE, REFERENCES information specified, 
  pass it on to the children tables.
 
 How will you maintain a primary key in such a table, 
 considering that indexes can't span multiple tables?

Many partitioning schemes have (or can be made to have) only one
possible target partition for the primary key. Thus if you create
separate unique indexes on each partition the problem is solved.
For a first version I opt, that it is sufficient to disallow creation of
a unique index on the master, when the constraints (on columns of this
index) do not nail down a specific partition for each row (e.g. a hash
or a range on one of the index columns that does not overlap).
Later, global index, or indexes with separate partitioning rules can be
implemented, that cover the other cases.

Andreas

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Auto creation of Partitions

2007-03-07 Thread NikhilS

Hi,

On 3/7/07, Zeugswetter Andreas ADI SD [EMAIL PROTECTED] wrote:



  iv) Based on the PRIMARY, UNIQUE, REFERENCES information specified,
  pass it on to the children tables.

 How will you maintain a primary key in such a table,
 considering that indexes can't span multiple tables?

Many partitioning schemes have (or can be made to have) only one
possible target partition for the primary key. Thus if you create
separate unique indexes on each partition the problem is solved.
For a first version I opt, that it is sufficient to disallow creation of
a unique index on the master, when the constraints (on columns of this
index) do not nail down a specific partition for each row (e.g. a hash
or a range on one of the index columns that does not overlap).
Later, global index, or indexes with separate partitioning rules can be
implemented, that cover the other cases.

Andreas



Yes, I agree. For version 1, UNIQUE/PRIMARY indexes will cascade down to the
child table, only if the indexed column is present as part of the
partitioning rule.

Regards,
Nikhils

--
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] Auto creation of Partitions

2007-03-07 Thread Alvaro Herrera
I am wondering if we can implement unique indexes across several tables
(inheritance hierarchy) not by using a single, big index covering all
the tables, but rather by inserting a dummy entry into each partition's
unique index.  This dummy entry would have an expanded CTID which would
include the tableoid, so it's possible to check it (albeit there is a
problem in that we may require the opening of another heap to do the
actual checking).  These dummy entries could be removed by bulkcleanup
as soon as the inserting transaction is no longer running, to avoid
bloating the index too much.  All said dummy index entries would be
located at either the rightmost or the leftmost leaf, or close to it, so
another idea is to have future inserters reuse the entry for a different
key.

The obvious problem with this is, naturally, the excess I/O that extra
index traversing causes.  The not so obvious ones are locking,
deadlocking and the opening of other heaps and indexes while you do the
insertion, which may be too expensive.  On the other hand, maybe this
idea is easier to implement than full-fledged cross-table indexes, so we
could have richer partitioning earlier than when somebody finally bites
the bullet and implements cross-table indexes.

Or maybe this is just a dumb idea, but I had to let it out anyway :-)

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Auto creation of Partitions

2007-03-07 Thread Luke Lonergan
Simon,

On 3/7/07 5:26 AM, Simon Riggs [EMAIL PROTECTED] wrote:

 What is really needed is a data structure that allows range partitions
 to be accessed more efficiently. This could make adding partitions and
 deciding in which partition a specific value goes an O(logN) operation.

I think we need to re-evaluate the inheritance mechanism for partitioning
and implement something much closer to the storage layer, similar to Oracle.

Having the constraint exclusion occur in the planner is not flexible enough
to allow more advanced solutions.

- Luke 



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Auto creation of Partitions

2007-03-07 Thread Simon Riggs
On Wed, 2007-03-07 at 10:23 -0500, Luke Lonergan wrote:
 Simon,
 
 On 3/7/07 5:26 AM, Simon Riggs [EMAIL PROTECTED] wrote:
 
  What is really needed is a data structure that allows range partitions
  to be accessed more efficiently. This could make adding partitions and
  deciding in which partition a specific value goes an O(logN) operation.
 
 I think we need to re-evaluate the inheritance mechanism for partitioning
 and implement something much closer to the storage layer, similar to Oracle.

Oracle's implementation is fairly close to ours, actually. Each
partition is a segment. SQLServer's is fairly low level.

 Having the constraint exclusion occur in the planner is not flexible enough
 to allow more advanced solutions.

It depends what those advanced solutions are. I definitely want to be
able to do run-time exclusion, push down merge joins and parallelism,
but I also think that being able to index only the first 3 partitions is
a useful feature too.

ISTM that making the Append node responsible for exclusion might be a
way to go with this, but various ways are possible, I'm sure. The trick
is to find one that does everything you need and that will take some
deeper analysis.

However you do it, you'll still need a way of deciding which partitions
to include/exclude that doesn't involve a sequential scan of all
partition constraints. So my comments about a different index structure
are still going to be relevant, wherever that lives/executes.

I'm not doing anything in this area personally for 8.3 though.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Auto creation of Partitions

2007-03-07 Thread Zeugswetter Andreas ADI SD

  What is really needed is a data structure that allows range
partitions 
  to be accessed more efficiently. This could make adding partitions
and 
  deciding in which partition a specific value goes an O(logN)
operation.
 
 I think we need to re-evaluate the inheritance mechanism for 
 partitioning and implement something much closer to the 
 storage layer, similar to Oracle.
 
 Having the constraint exclusion occur in the planner is not 
 flexible enough to allow more advanced solutions.

Whoa, do you have anything to back that up ?
You would need to elaborate what you actually mean, but I think it is
moot.
Sure, the constraint technique can be further extended (e.g. during
runtime), but imho the approach is very good.

Andreas

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Auto creation of Partitions

2007-03-07 Thread Joris Dobbelsteen
-Original Message-
From: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED] On Behalf Of Alvaro Herrera
Sent: woensdag 7 maart 2007 15:59
To: NikhilS
Cc: Zeugswetter Andreas ADI SD; Peter Eisentraut; 
pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Auto creation of Partitions

I am wondering if we can implement unique indexes across 
several tables (inheritance hierarchy) not by using a single, 
big index covering all the tables, but rather by inserting a 
dummy entry into each partition's unique index.

I think that one big index is much better in this case. You are already
replicating the data and need a lot of work when updating the partition
structure (e.g. removing one).

[snip]

Rather:
If we have the partition relations (aka partition), then we can define a
unique index on it. This guarentees that there is at most one tuple with
the same value (for the specified columns) for every individual
partition.

Now for the inserts.
We already guarentee that is unique within the partition it lives. So we
must insert it first.
Next its to check the other partitions (in order!) for existence of a
row with a similar context. Of course we require full visibility of the
data.
We do the insert first, as this ensures other will find it. The order is
also important, otherwise there can be situations where we can't
guarentee the constraint.

Updates are exactly the same.
Deletes are trivial.

Perhaps you can do it without an index, but this is nothing different
than for a normal table.

- Joris Dobbelsteen

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Auto creation of Partitions

2007-03-07 Thread Luke Lonergan
Andreas,

On 3/7/07 11:45 AM, Zeugswetter Andreas ADI SD [EMAIL PROTECTED]
wrote:

 Whoa, do you have anything to back that up ?

Sure - when we start to consider designs that implement advanced data
management features, we run into problems with the architecture of
tables-tables-tables  Here are some examples:
1 - people think of partitions as a logical building block for tables, they
would like to move partitions around underneath a table without the table
definition being involved.  In the current implementation, there are
explicit linkages between the table definition and the child tables -
imagine an ALTER TABLE foo_parent ADD COLUMN and how it would need to
cascade to 1,000 child tables and you get the beginning of it - this
connection should not exist.

2 - INSERT/UPDATE/DELETE processing through the SQL rewrite layer (rules) is
terribly slow and gets slower as you add more partitions.  If done closer to
the storage layer, this can be done in ways that use access methods shared
with other storage entities, e.g. Indices, and the code path would flow more
naturally.

3 - Parallel query can be accomplished more easily by separating scans
across relations split among tablespaces.  This is more natural than trying
to parallelize APPEND nodes within existing plans

 You would need to elaborate what you actually mean, but I think it is
 moot.
 Sure, the constraint technique can be further extended (e.g. during
 runtime), but imho the approach is very good.

Well, it's being used and that's good, but it needs to be better IMO and I
think that before we go too far down the current path we should consider the
alternatives more carefully.

- Luke



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Auto creation of Partitions

2007-03-07 Thread Jim Nasby

On Mar 6, 2007, at 9:13 AM, Shane Ambler wrote:

NikhilS wrote:

On 3/6/07, Peter Eisentraut [EMAIL PROTECTED] wrote:

1) Whether we should use triggers/rules for step number (iii) above.
 Maybe rules is the way to go.

Since this would basically be a case of the updatable rules  
problem, you
should review those discussions in the past to check whether the  
issues

mentioned there don't interfere with that plan.

The rules mentioned here will be to specify that all the
inserts/updates/deletes should go into proper children tables  
instead of the
parent. I do not see the updateable rules problem with regards to  
this, but
will check out the archives for discussion on this related to  
partitioning.


I would think that a trigger would be a better choice as I see the  
need (or at least the possibility) for more than just a rewrite.  
When a value is inserted that is outside of a condition currently  
covered by an existing child table then a new child will need to be  
spawned to contain the new data.


There's no reason a new child has to be spawned, and I don't think  
that having a DML statement 'automagically' generating DDL is such a  
hot idea, either.


Also, there's nothing inherently wrong with having an 'overflow  
partition' (to use Oracle syntax) that holds values that don't fall  
in the range of any other tables. The obvious place to do that with  
our partitioning is in the parent table.


There are 2 other reasons to favor triggers though:

1) People (Josh Drake comes to mind) have found that if you get over  
a tiny number of partitions, the performance of rules is abysmal.


2) I believe it should be possible to construct an update trigger  
that allows you to perform updates that will place the row in  
question into a new partition. While I can see cases for simply  
disallowing updates to the partitioning key, I think there are also  
times when being able to do that would be very useful.


Will ALTER TABLE be extended to handle partitions? This will allow  
partitioning existing tables (maybe without blocking access?) and  
allow things like ALTER TABLE mytable ADD PARTITION (mydate within  
200703)
and ALTER TABLE mytable DROP PARTITION (mydate within 199912) or  
would dropping be covered by DELETE FROM mytable where mydate =  
199912 ?


I think it'd be great to make adding and removing partitions as  
simple as ALTER TABLE. I don't think that DELETE should be the  
mechanism to drop a partition, though. Again, DML statements  
shouldn't be performing DDL.


Could such a syntax be devised for date columns? (month of mydate)  
or similar to auto create partitions based on the year and month of  
a date column? or will we just do CHECK(mydatecol = 1/3/07 and  
mydatecol = 31/3/07) for each month of data? Also (day of  
mydatecol) to partition based on year and day of year.


Another syntax possibility - range(myserialcol of 50) where new  
child tables are created every 50 rows?


Maybe I'm looking at auto-maintenance which is beyond any current  
planning?


I don't think it's worth it to burden the database with auto-creating  
time partitions; it's easy enough to setup a cron job to handle it.


It might be more useful to have the database handle things like  
partitioning on a SERIAL column, though I agree with Nikhils that  
this should wait.


Does any other database support 'automagic' partition creation? I  
know Oracle 9i doesn't... not sure about 10g or DB2 or MSSQL...

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Auto creation of Partitions

2007-03-07 Thread Jim Nasby

On Mar 6, 2007, at 4:57 AM, NikhilS wrote:
iv) Based on the PRIMARY, UNIQUE, REFERENCES information specified,  
pass it on to the children tables.


Since we want to eventually support 'global' indexes, I think we need  
to be really careful here. There's actually 3 possible cases:


1) Index* should be global (spanning multiple tables)
2) Index* should be inherited by all partitions as they're created
3) Index* should exist only on the parent table

* Note that there's really no reason this has to be limit to indexes;  
it could certainly apply to constraints, or even triggers.


IIRC, Oracle has a special syntax for global indexes; any other index  
defined on a partitioned table is picked up *by newly created  
partitions*. If you want to add indexes to existing partitions, you  
have to explicitly add it to each partition.


I'd like to eventually see us supporting all 3 options. I'm not sure  
how much we want to clutter the grammar, though.

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Auto creation of Partitions

2007-03-07 Thread Jim Nasby

On Mar 7, 2007, at 3:26 AM, Simon Riggs wrote:
If you know that the constraints on each of the tables is distinct,  
then
building a UNIQUE index on each of the partitions is sufficient to  
prove

that all rows in the combined partitioned table are distinct also.

The hard part there is checking that the partition constraints are
distinct. If the partition constraints are added one at a time, you  
can

use the predicate testing logic to compare the to-be-added partition's
constraint against each of the already added constraints. That becomes
an O(N) problem.

What is really needed is a data structure that allows range partitions
to be accessed more efficiently. This could make adding partitions and
deciding in which partition a specific value goes an O(logN)  
operation.


Directing data to child tables with triggers pretty much necessitates  
having some way to codify what partition a particular row belongs in.  
IE: for partitioning by month, you'll see things like naming the  
partition tables parent_table_name_$YEAR_$MONTH, so the  
'partitioning function' takes a date or timestamp and then returns  
what partition it belongs to. Perhaps there is some way to use that  
mapping to drive the selection of what partitions could contain a  
given value?


One possibility would be to require 3 functions for a partitioned  
table: one accepts the partitioning key and tells you what partition  
it's in, one that tells you what the minimum partitioning key for a  
partition would be, and one that tells you what the maximum would be.  
If the user supplied those 3 functions, I think it would be possibly  
to automatically generate code for the triggers and check  
constraints. The min/max partition key functions might allow you to  
more efficiently do partition elimination, too.

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[HACKERS] Auto creation of Partitions

2007-03-06 Thread NikhilS

Hi,

This is to get feedback to meet the following TODO:

  - Simplify ability to create partitioned tables

  This would allow creation of partitioned tables without requiring
  creation of rules for INSERT/UPDATE/DELETE, and constraints for rapid
  partition selection. Options could include range and hash partition
  selection.


There was some discussion on the pgsql mailing lists, which lead to the
above TODO:
http://archives.postgresql.org/pgsql-hackers/2006-09/msg00189.php
http://archives.postgresql.org/pgsql-hackers/2006-08/msg01874.php

We can have the following syntax to support auto creation of partitions in
Postgresql:

CREATE TABLE tabname (
...
...
 ) PARTITION BY
 HASH(expr)
| RANGE(expr)
| LIST(expr)
[PARTITIONS num_partitions] /* will apply to HASH only for now*/
[PARTITION partition_name CHECK(...),
 PARTITION partition_name CHECK(...)
 ...
];

Here expr will be one of the column names as specified for the master
table. Once we finalize the syntax, the above statement would end up
carrying out the following activities (disclaimer: we might add or remove
some activities based on the discussion here).

i ) Create master table.
ii) Create children tables based on the number of partitions specified and
make them inherit from the master table.
iii) Auto generate rules (or triggers?) using the checks mentioned for the
partitions, to handle INSERTs/DELETEs/UPDATEs to navigate them to the
appropriate child. Note that checks specified directly on the master table
will get inherited automatically.
iv) Based on the PRIMARY, UNIQUE, REFERENCES information specified, pass it
on to the children tables.
v) If possible add CHECK (false) to the master table to avoid any activity
on it.

Some questions remain as to:

1) Whether we should use triggers/rules for step number (iii) above. Maybe
rules is the way to go.
2) What other attributes (access permissions e.g.) of the master along with
the ones specified in (iv) should be passed on to the children.
3) Some implementation specific issue e.g. whether SPI_execute would be a
good way of creating these rules.

Comments appreciated,
Regards,
Nikhils
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] Auto creation of Partitions

2007-03-06 Thread Peter Eisentraut
NikhilS wrote:
 iv) Based on the PRIMARY, UNIQUE, REFERENCES information specified,
 pass it on to the children tables.

How will you maintain a primary key in such a table, considering that 
indexes can't span multiple tables?

 1) Whether we should use triggers/rules for step number (iii) above.
 Maybe rules is the way to go.

Since this would basically be a case of the updatable rules problem, you 
should review those discussions in the past to check whether the issues 
mentioned there don't interfere with that plan.

 2) What other attributes (access permissions e.g.) of the master
 along with the ones specified in (iv) should be passed on to the
 children.

Moreover, how are later changes of those attributes propagated?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Auto creation of Partitions

2007-03-06 Thread NikhilS

Hi,

On 3/6/07, Peter Eisentraut [EMAIL PROTECTED] wrote:


NikhilS wrote:
 iv) Based on the PRIMARY, UNIQUE, REFERENCES information specified,
 pass it on to the children tables.

How will you maintain a primary key in such a table, considering that
indexes can't span multiple tables?



We will not (I know its a hard thing to do :) ), the intention is to use
this information from the parent and make it a property of the child table.
This will avoid the step for the user having to manually specify CREATE
INDEX and the likes on all the children tables one-by-one.


1) Whether we should use triggers/rules for step number (iii) above.
 Maybe rules is the way to go.

Since this would basically be a case of the updatable rules problem, you
should review those discussions in the past to check whether the issues
mentioned there don't interfere with that plan.



The rules mentioned here will be to specify that all the
inserts/updates/deletes should go into proper children tables instead of the
parent. I do not see the updateable rules problem with regards to this, but
will check out the archives for discussion on this related to partitioning.


2) What other attributes (access permissions e.g.) of the master
 along with the ones specified in (iv) should be passed on to the
 children.

Moreover, how are later changes of those attributes propagated?



Once created, this will be a normal inheritance relationship between the
tables and all the existing commands will apply to both the parent and the
child.

The basic idea here is to automate as many things as possible at partition
creation time. The user is free to make additional changes to the involved
tables later too.

Regards,
Nikhils

--
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] Auto creation of Partitions

2007-03-06 Thread Gregory Stark
NikhilS [EMAIL PROTECTED] writes:

the intention is to use this information from the parent and make it a
property of the child table. This will avoid the step for the user having to
manually specify CREATE INDEX and the likes on all the children tables
one-by-one.

Missed the start of this thread. A while back I had intended to add WITH
INDEXES to CREATE TABLE LIKE. That would let you create a tale LIKE parent
WITH CONSTRAINTS WITH INDEXES and get pretty much a perfect table ready for
adding to the inheritance structure.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Auto creation of Partitions

2007-03-06 Thread NikhilS

On 3/6/07, NikhilS [EMAIL PROTECTED] wrote:


Hi,

On 3/6/07, Gregory Stark [EMAIL PROTECTED] wrote:

 NikhilS [EMAIL PROTECTED] writes:

 the intention is to use this information from the parent and make it a
 property of the child table. This will avoid the step for the user
 having to
 manually specify CREATE INDEX and the likes on all the children tables
 one-by-one.

 Missed the start of this thread. A while back I had intended to add WITH
 INDEXES to CREATE TABLE LIKE. That would let you create a tale LIKE
 parent
 WITH CONSTRAINTS WITH INDEXES and get pretty much a perfect table ready
 for
 adding to the inheritance structure.


Yeah, this one aims to do pretty much the above as part of the auto
creation of the inheritance-based partitions.



And to add, maybe if there is consensus/demand for the WITH INDEXES idea
mentioned above too, I could work on it as well.

Regards,
Nikhils

Regards,

Nikhils


--
   Gregory Stark
   EnterpriseDB   http://www.enterprisedb.com




--
EnterpriseDB   http://www.enterprisedb.com





--
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] Auto creation of Partitions

2007-03-06 Thread NikhilS

Hi,

On 3/6/07, Gregory Stark [EMAIL PROTECTED] wrote:


NikhilS [EMAIL PROTECTED] writes:

the intention is to use this information from the parent and make it a
property of the child table. This will avoid the step for the user having
to
manually specify CREATE INDEX and the likes on all the children tables
one-by-one.

Missed the start of this thread. A while back I had intended to add WITH
INDEXES to CREATE TABLE LIKE. That would let you create a tale LIKE parent
WITH CONSTRAINTS WITH INDEXES and get pretty much a perfect table ready
for
adding to the inheritance structure.



Yeah, this one aims to do pretty much the above as part of the auto creation
of the inheritance-based partitions.

Regards,
Nikhils


--

  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com





--
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] Auto creation of Partitions

2007-03-06 Thread Shane Ambler

NikhilS wrote:

Hi,

On 3/6/07, Peter Eisentraut [EMAIL PROTECTED] wrote:


NikhilS wrote:
 iv) Based on the PRIMARY, UNIQUE, REFERENCES information specified,
 pass it on to the children tables.

How will you maintain a primary key in such a table, considering that
indexes can't span multiple tables?



We will not (I know its a hard thing to do :) ), the intention is to use
this information from the parent and make it a property of the child table.
This will avoid the step for the user having to manually specify CREATE
INDEX and the likes on all the children tables one-by-one.


I think a way can be devised to maintain the primary key and unique 
constraints.
If a search is done on the parent table, the planner knows to rewrite 
the query as a union (or union all) of all child tables that relate to 
the where clause, or all child tables if the where clause is not on the 
column/s used to partition, then this concept should be able to be 
converted to indexes as well, so that when a primary or unique index 
from a child table is inserted to, then each of the related child 
indexes is consulted to ensure uniqueness.
This would only apply if the partitioning was not done by the primary or 
unique column.



1) Whether we should use triggers/rules for step number (iii) above.
 Maybe rules is the way to go.

Since this would basically be a case of the updatable rules problem, you
should review those discussions in the past to check whether the issues
mentioned there don't interfere with that plan.


The rules mentioned here will be to specify that all the
inserts/updates/deletes should go into proper children tables instead of 
the

parent. I do not see the updateable rules problem with regards to this, but
will check out the archives for discussion on this related to partitioning.



I would think that a trigger would be a better choice as I see the need 
(or at least the possibility) for more than just a rewrite. When a value 
is inserted that is outside of a condition currently covered by an 
existing child table then a new child will need to be spawned to contain 
the new data.


So say we partition by year and month of a date column? As new dates are 
added new child tables would be created each month. Or is this beyond 
the current plan and left to manual creation?


Will ALTER TABLE be extended to handle partitions? This will allow 
partitioning existing tables (maybe without blocking access?) and allow 
things like ALTER TABLE mytable ADD PARTITION (mydate within 200703)
and ALTER TABLE mytable DROP PARTITION (mydate within 199912) or would 
dropping be covered by DELETE FROM mytable where mydate = 199912 ?


Could such a syntax be devised for date columns? (month of mydate) or 
similar to auto create partitions based on the year and month of a date 
column? or will we just do CHECK(mydatecol = 1/3/07 and mydatecol = 
31/3/07) for each month of data? Also (day of mydatecol) to partition 
based on year and day of year.


Another syntax possibility - range(myserialcol of 50) where new 
child tables are created every 50 rows?



Maybe I'm looking at auto-maintenance which is beyond any current planning?



--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Auto creation of Partitions

2007-03-06 Thread Chris Browne
[EMAIL PROTECTED] (Peter Eisentraut) writes:
 NikhilS wrote:
 iv) Based on the PRIMARY, UNIQUE, REFERENCES information specified,
 pass it on to the children tables.

 How will you maintain a primary key in such a table, considering that 
 indexes can't span multiple tables?

On the one hand, I seem to recall seeing multiple-table-spanning
indices being on the TODO list.

On the other, it may be that this sort of partitioning is only usable
for scenarios where it is only needful to maintain uniqueness on a
partition-by-partition basis.

 1) Whether we should use triggers/rules for step number (iii) above.
 Maybe rules is the way to go.

 Since this would basically be a case of the updatable rules problem, you 
 should review those discussions in the past to check whether the issues 
 mentioned there don't interfere with that plan.

 2) What other attributes (access permissions e.g.) of the master
 along with the ones specified in (iv) should be passed on to the
 children.

 Moreover, how are later changes of those attributes propagated?

I hear rumour of there being a more comprehensive proposal on this in
the works...
-- 
(reverse (concatenate 'string ofni.secnanifxunil @ enworbbc))
http://linuxfinances.info/info/finances.html
Rules of  the Evil Overlord #189. I  will never tell the  hero Yes I
was the one who  did it, but you'll never be able  to prove it to that
incompetent  old fool.  Chances  are, that  incompetent  old fool  is
standing behind the curtain.  http://www.eviloverlord.com/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Auto creation of Partitions

2007-03-06 Thread Martijn van Oosterhout
On Wed, Mar 07, 2007 at 02:43:39AM +1030, Shane Ambler wrote:
 I think a way can be devised to maintain the primary key and unique 
 constraints.
 If a search is done on the parent table, the planner knows to rewrite 
 the query as a union (or union all) of all child tables that relate to 
 the where clause, or all child tables if the where clause is not on the 
 column/s used to partition, then this concept should be able to be 
 converted to indexes as well, so that when a primary or unique index 
 from a child table is inserted to, then each of the related child 
 indexes is consulted to ensure uniqueness.

But that's where it breaks down: you not only need to check that the
row you're inserting is unique, you need to make sure that other people
trying to insert the same value see it.

After all, a unique index does contain multiple instances of any
particular value, it's just that the DB has verified that none of these
are visible to the same transaction. If there is a collision a lock is
taken that waits until the colliding insert commits or rejects.

With b-tree it's easy, every value can only appear in one place, so
it's a very simple lock. The reason why GiST doesn't support unique
indexes is due to the fact that any particular value could appear in
many places, so to stop concurrent inserts you need to lock *every*
page in the tree that another backend might want to insert the record.

I fear this will also be a problem for indexes over multiple tables, you
will need to lock the page in every index that some other DB might want
to insert the value into. All while avoiding deadlocks.

I think the problems are related. If you can solve it for multiple
tables, you can solve it for GiST indexes also.

 This would only apply if the partitioning was not done by the primary or 
 unique column.

That's the easy case, no problem there.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature


Re: [HACKERS] Auto creation of Partitions

2007-03-06 Thread Florian G. Pflug

Martijn van Oosterhout wrote:

On Wed, Mar 07, 2007 at 02:43:39AM +1030, Shane Ambler wrote:
I think a way can be devised to maintain the primary key and unique 
constraints.
If a search is done on the parent table, the planner knows to rewrite 
the query as a union (or union all) of all child tables that relate to 
the where clause, or all child tables if the where clause is not on the 
column/s used to partition, then this concept should be able to be 
converted to indexes as well, so that when a primary or unique index 
from a child table is inserted to, then each of the related child 
indexes is consulted to ensure uniqueness.


But that's where it breaks down: you not only need to check that the
row you're inserting is unique, you need to make sure that other people
trying to insert the same value see it.


This sounds like what is really needed is a way to lock a certain
condition, namely the existance or non-existance of a record with
certain values in certain fields. This would not only help this case,
it would also help RI triggers, because those wouldn't have to acquire
a share lock on the referenced rows anymore.

As you pointed out, this would also make unique GiST indices possible

No real idea how to do this, though :-(

greetings, Florian Pfluge


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Auto creation of Partitions

2007-03-06 Thread Peter Eisentraut
NikhilS wrote:
 We will not (I know its a hard thing to do :) ), the intention is to
 use this information from the parent and make it a property of the
 child table. This will avoid the step for the user having to manually
 specify CREATE INDEX and the likes on all the children tables
 one-by-one.

But when I say

CREATE TABLE ( a int PRIMARY KEY, ... ) PARTITION blah ...

then I expect that the primary key will be enforced across all 
partitions.  We currently sidestep that issue by not offering seemingly 
transparent partitioning.  But if you are planning to offer that, the 
unique index issue needs to be solved, and I see nothing in your plan 
about that.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Auto creation of Partitions

2007-03-06 Thread Josh Berkus
Florian,

 This sounds like what is really needed is a way to lock a certain
 condition, namely the existance or non-existance of a record with
 certain values in certain fields. This would not only help this case,
 it would also help RI triggers, because those wouldn't have to acquire
 a share lock on the referenced rows anymore.

That's called predicate locking and it's very, very hard to do.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Auto creation of Partitions

2007-03-06 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 But when I say
 CREATE TABLE ( a int PRIMARY KEY, ... ) PARTITION blah ...
 then I expect that the primary key will be enforced across all 
 partitions.  We currently sidestep that issue by not offering seemingly 
 transparent partitioning.  But if you are planning to offer that, the 
 unique index issue needs to be solved, and I see nothing in your plan 
 about that.

Agreed, it needs to Just Work.  I think it'd still be useful though
if we only support auto-partitioning on the primary key, and that
restriction avoids the indexing problem.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Auto creation of Partitions

2007-03-06 Thread Josh Berkus
 Agreed, it needs to Just Work.  I think it'd still be useful though
 if we only support auto-partitioning on the primary key, and that
 restriction avoids the indexing problem.

+1

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Auto creation of Partitions

2007-03-06 Thread Andrew Dunstan

Tom Lane wrote:

Peter Eisentraut [EMAIL PROTECTED] writes:
  

But when I say
CREATE TABLE ( a int PRIMARY KEY, ... ) PARTITION blah ...
then I expect that the primary key will be enforced across all 
partitions.  We currently sidestep that issue by not offering seemingly 
transparent partitioning.  But if you are planning to offer that, the 
unique index issue needs to be solved, and I see nothing in your plan 
about that.



Agreed, it needs to Just Work.  I think it'd still be useful though
if we only support auto-partitioning on the primary key, and that
restriction avoids the indexing problem.

  


Maybe. The most obvious use for automatic partitioning that I can think 
of would be based in the value of a timestamptz field rather than any 
PK. Of course I tend to work more in the OLTP field than in DW type 
apps, where other considerations might apply.


cheers

andrew

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Auto creation of Partitions

2007-03-06 Thread Joris Dobbelsteen
-Original Message-
From: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED] On Behalf Of Josh Berkus
Sent: dinsdag 6 maart 2007 19:45
To: pgsql-hackers@postgresql.org
Cc: Florian G. Pflug; Martijn van Oosterhout; Shane Ambler; 
NikhilS; Peter Eisentraut
Subject: Re: [HACKERS] Auto creation of Partitions

Florian,

 This sounds like what is really needed is a way to lock a certain 
 condition, namely the existance or non-existance of a record with 
 certain values in certain fields. This would not only help 
this case, 
 it would also help RI triggers, because those wouldn't have 
to acquire 
 a share lock on the referenced rows anymore.

That's called predicate locking and it's very, very hard to do.

That's definitely not needed.

Rather something good such that we can finally enforce RI ourselves in
the general case. This is currently not possible to do easily, except in
C code. This means we need to look at all the rows that exists, but are
normally be invisible to our view of the database. Still I'm not sure
about all cases, as the MVCC model is quite tricky and I'm not sure
whether my idea's about it are valid.

The basic idea is that you need to guarentee the constraint for the
'single underlaying model' (with everything visible) and for your view
(under your visibility rules). I believe, but are not certain, that
under these conditions any (valid) snapshot will obey the desired
constraints.

- Joris Dobbelsteen


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Auto creation of Partitions

2007-03-06 Thread NikhilS

Hi Shane,

Maybe I'm looking at auto-maintenance which is beyond any current planning?


Many of your suggestions are useful, but auto-maintenance will be beyond the
current plan.

Regards,
Nikhils

EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] Auto creation of Partitions

2007-03-06 Thread NikhilS

Hi,

On 3/7/07, Tom Lane [EMAIL PROTECTED] wrote:


Peter Eisentraut [EMAIL PROTECTED] writes:
 But when I say
 CREATE TABLE ( a int PRIMARY KEY, ... ) PARTITION blah ...
 then I expect that the primary key will be enforced across all
 partitions.  We currently sidestep that issue by not offering seemingly
 transparent partitioning.  But if you are planning to offer that, the
 unique index issue needs to be solved, and I see nothing in your plan
 about that.

Agreed, it needs to Just Work.  I think it'd still be useful though
if we only support auto-partitioning on the primary key, and that
restriction avoids the indexing problem.

regards, tom lane



Sure, but as Chris mentioned earlier, wouldn't it be useful to maintain
uniqueness on a
partition-by-partition basis too?

Regards,
Nikhils
--
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] Auto creation of Partitions

2007-03-06 Thread Stefan Kaltenbrunner

Andrew Dunstan wrote:

Tom Lane wrote:

Peter Eisentraut [EMAIL PROTECTED] writes:
 

But when I say
CREATE TABLE ( a int PRIMARY KEY, ... ) PARTITION blah ...
then I expect that the primary key will be enforced across all 
partitions.  We currently sidestep that issue by not offering 
seemingly transparent partitioning.  But if you are planning to offer 
that, the unique index issue needs to be solved, and I see nothing in 
your plan about that.



Agreed, it needs to Just Work.  I think it'd still be useful though
if we only support auto-partitioning on the primary key, and that
restriction avoids the indexing problem.

  


Maybe. The most obvious use for automatic partitioning that I can think 
of would be based in the value of a timestamptz field rather than any 
PK. Of course I tend to work more in the OLTP field than in DW type 
apps, where other considerations might apply.


I second that - partitioning on some kind of timestamp field is a common 
usecase here too ...



Stefan

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Auto creation of Partitions

2007-03-06 Thread Joshua D. Drake




Maybe. The most obvious use for automatic partitioning that I can 
think of would be based in the value of a timestamptz field rather 
than any PK. Of course I tend to work more in the OLTP field than in 
DW type apps, where other considerations might apply.


I second that - partitioning on some kind of timestamp field is a 
common usecase here too ...
Partitioning period needs to work. It doesn't matter what the user 
chooses as their partition key. Timestamp is an obvious
choice but there are others such as serial where you just partition 
every million rows (for example) to keep things manageable.


Joshua D. Drake




Stefan

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly




---(end of broadcast)---
TIP 6: explain analyze is your friend