partitioning star schema

becker . bill Fri, 10 Jan 2003 06:59:57 -0800

Hello,

We are still struggling with partitioning of star schema fact tables.
As of yet, we haven't been able to test/compare any of the following
scenarios (because we're not yet legal with the partitioning option),
so I am posting in the hope that someone with more partitioning
experience will comment. This will eventually be implemented on 9.2
on Solaris.


By star schema fact tables, I am referring to tables that consist mostly
of surrogate key id fields (used for joining to dimension tables),
and numeric fields containing a quantity measure.

The id fields are never directly referenced in WHERE clauses of queries as
*filter* conditions, but are frequently referenced in join conditions. The
filter conditions usually reference fields in one of the dimension tables
joined to be the fact table.

We have developed some (untested) practical guidelines for partitioning. 
They are listed from best to worst. These are intended to optimize querying
(not the incremental loading), and they apply to tables rather than
indexes. (We are creating a similar list for indexes) Here they are:

1) partition by a field most frequently referenced in the WHERE clause
   as a filter condition; subpartition by a field less frequently
   referenced as a filter condition. This enables a double partition-pruning.
2) partition by a field most frequently referenced in the WHERE clause
   as a filter condition; subpartition by a field frequently
   referenced as a join condition, where the joined-to table is 
   partitioned exactly the same way. This enables partition-pruning
   and partition-wise joins.
3) partition by a field in the table that is often referenced
   in WHERE clauses as a filter condition; this enables partition
   pruning.
4) partition by a frequently-used join field where the joined-to table
   is partitioned exactly the same; this enables partition-wise joins.
5) partition by a frequently-used join field. 
6) partition by something is usually better than not partitioning at all.

In many cases, we have to go all the way to #5 before this applies. As I said,
the fact table id fields are never referenced in WHERE clauses, the dimension
tables are rarely large enough to be partitioned, and the WHERE filter conditions
usually apply to a dimension table, so we wind up partitioning by an id field
frequently used in a join clause. This id field is often a date_id field,
which is used to join to a dates dimension table, because a date range is
frequently used as a filter condition in queries.

Questions:
1) Do you agree with the ranking above?
2) Is there any substantial benefit to partitioning a fact table by an id
   field, when the id field is used to join to a non-partitioned dimension
   table which is referenced in a filter condition?

Thanks to all who made it this far. 
More thanks to any responders.
Most thanks to those with helpful comments.
All-thanked-out, Bill.

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.net
-- 
Author: 
  INET: [EMAIL PROTECTED]

Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
San Diego, California        -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).

partitioning star schema

Reply via email to