[HACKERS] Named vs Unnamed Partitions

Markus Schiltknecht Tue, 08 Jan 2008 11:13:03 -0800

Hi,

IMO, the lengthy discussion about Segment Exclusion and SegmentVisibility Maps has long turned into a discussion about partitioning ingeneral. I'm thankful for all the new insights it has brought me and Iwant to continue sharing my view on things. What's following is highlytheoretical and has brainstorming characteristics. You've been warned.

There are two very distinct ways to handle partitioning. For now, I'mcalling them named and unnamed partitioning. Let's have a closer look atboth options from a users point of view. I'm using Andrew's pseudo DDLexample from the above mentioned thread:


        ALTER TABLE foo
          SET read_only='t'
          WHERE created_on < '2007-01-01';

Given all tuples were read-writeable before, that implicitly created twopartitions. Giving them names could look like that:


        ALTER TABLE foo
          SPLIT INTO old_foos AND new_foos;
          AT created_on < '2007-01-01'
        ALTER PARTITION old_foos
          SET READ ONLY;

Instead of only setting the read-only property, one could also set analternative table space for the partition:


        ALTER TABLE foo
          SET TABLE SPACE large_but_slow_storage
          WHERE created_on < '2007-01-01';

vs:

        ALTER PARTITION old_foos
          SET TABLE SPACE large_but_slow_storage;

Please also note, that neither variant is limited to range partitioning.You can theoretically partition by pretty much anything, for examplewith a WHERE clause like:


        ..WHERE (id % 5) < 2

The primary difference I see between these two ways to declarepartitions is, that the former only modifies tuple properties(read-only, storage location), while the later also tells the database*why* it has to modify them.

That has several different effects. First, newly inserted tuples aretreated differently. For unnamed partitions, there must be defaults,like read-writable and a default table space. With named partitions, youdefine split points, so I guess one expects newly inserted tuples to endup in the right partition automatically. Unnamed partitioning could beequally automatic when letting a function decide, where to insert thenew tuple.

Second, repartitioning must be treated differently. With unnamedpartitioning, the admin must first adjust the defaults (if required) andthen move the existing tuple properties accordingly. With namedpartitions, the admin only needs to adjust the split point and thedatabase system knows what it has to do.

And third, but IMO most importantly: to be able to optimize queries, thedatabase system has to know split points, so it can exclude partitionsor segments from scanning. Obviously, with named partitions, it alwaysknows them. Otherwise, you'll have to maintain some information aboutthe tuples in your partitions, as Simon does with the min/max tuples. Assoon as required, it could also maintain additional min/max values, i.e.for (id % 5) for the above example.

I hope to have shown the most relevant aspects. To conclude, I'd saythat named partitioning is closer to manually managed partitioning, asalready known and often used. While unnamed partitioning is closer toautomated partitioning, where the DBA does *not need* to have names forpartitions, which is a pretty new and interesting idea to me.


Regards

Markus


---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

[HACKERS] Named vs Unnamed Partitions

Reply via email to