On L, 2005-03-19 at 12:02 -0800, Josh Berkus wrote: > Folks, > > I may (or may not) soon have funding for implementing full table partitioning > in PostgreSQL.
If you don't get it, contact me as there is a small possibility that I know a company interested enough to fund (some) of it :) > I thought it would be a good idea to discuss with people here > who are already using pseudo-partitioning what things need to be added to > Postgresql in order to make full paritioning a reality; that is, what do > other databases do that we don't? As these are already discussed in this thread, I'll try to outline a method of providing a global index (unique or not) in a way that will still make it possible to quickly remove (and not-quite-so-quickly add) a partition. The structure is inspired by the current way of handling >1Gb tables. As each tid consists of 32 bit page pointer we have pointerspace of 35184372088832 bytes/index (4G of 8k pages). currently this is directly partitioned mapped to 1Gbyte/128kpage files, but we can, with minimal changes to indexes, put a lookup table between index and page lookup. In case of global index over partitions this table could point to 1G subtables from different partition tables. The drop partition table can also be fast - just record the pages in lookup table as deleted - means one change per 1G of dropped table. The next vacuum should free pointers to deleted subfiles. Adding partitions is trickier - If the added table forms part of partitioning index (say names from C to E), and there is a matching index on subtable, Then that part of btree can probably copied into the main btree index as a tree btanch, which should be relatively fast (compared to building it one tid at a time). Else adding the the index could probably also be sped up by some kind of index merge - faster than building from scratch but slower than above. To repeat - the global index over partitioned table should have te same structure as our current b-tree index, only with added map of 128k index partitions to 1G subfiles of (possibly different) tables. This map will be quite small - for 1Tb of data it will be only 1k entries - this will fit in cache on all modern processors and thus should add only tiny slowdown from current direct tid.page/128k method -- Hannu Krosing <[EMAIL PROTECTED]> ---------------------------(end of broadcast)--------------------------- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]