Re: [PERFORM] What needs to be done for real Partitioning?

Greg Stark Sun, 20 Mar 2005 15:21:35 -0800

Josh Berkus <[email protected]> writes:

> Well, I would think that specifying an expression that defines a new 
> partition 
> at each change in value (like EXTRACT(day FROM timestamp) on a time-based 
> partitioning) would cover 90% of implemenations and be a lot simpler to 
> administer.   The Oracle approach has the advantage of allowing "custom 
> paritioning" at the expense of greater complexity.


Hm. This is where I might be less helpful. Once you're submersed in one way of
doing things it can be hard to think outside the box like this.

But I fear this scheme might be harder to actually take advantage of. If I do
a query like 

 WHERE timestamp BETWEEN '2005-01-01 11:00' AND '2005-01-01 12:00'

How do you determine which partitions that range will cover?

Also, it seems like it would be inconvenient to try to construct expressions
to handle things like "start a new partition ever 1 million values".

And worse, how would you handle changing schemes with this? Like, say we want
to switch from starting one partition per month to starting one partition per
week?



I think some actual use cases might be helpful for you. I can contribute an
interesting one, though I have to be intentionally vague even though I don't
work on that system any more.

We had a table with a layout like:

txnid serial,
groupid integer,
data...

Each day a cron job created 6 new groups (actually later that was changed to
some other number). It then added a new partition to handle the range of the
new day's groups. Later another cron job exchanged out the partition from a
week earlier and exported that table, transfered it to another machine and
loaded it there.

txnid was a unique identifier but we couldn't have a unique constraint because
that would have required a global index. That didn't cause any problems since
it was a sequence generated column anyways.

We did have a unique index on <groupid,txnid> which is a local index because
groupid was the partition key. In reality nothing in our system ever really
needed a txn without knowing which group it came from anyways, so it was easy
to change our queries to take advantage of this.

We had a lot of jobs, some *extremely* performance sensitive that depended on
being able to scan the entire list of txns for a given day or a given set of
groupids. The partitions meant it could do a full table scan which made these
extremely fast.

This was with Oracle 8i. All partition keys in 8i were ranges. In 9 Oracle
added the ability to make partition reference specific id values. Sort of like
how you're describing having a key expression. We might have considered using
that scheme with groupid but then it would have meant adding a bunch of new
partitions each day and having some queries that would involve scanning
multiple partitions.

-- 
Greg


---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
      subscribe-nomail command to [EMAIL PROTECTED] so that your
      message can get through to the mailing list cleanly

Re: [PERFORM] What needs to be done for real Partitioning?

Reply via email to