"Marc G. Fournier" <[EMAIL PROTECTED]> writes:
> Actually, the use of schemas was my idea, to speed up some dreadfully
> slow queries dealing with traffic stats from a table that was growing
> painfully monolithic ... the idea is/was that it would be easier to
> backup/remove all data pertaining to a specific client if they decided to
> close their account ...
I would add an <aol> me too </aol> to the comment that this is a horrible idea and will be an enormous pain. You're denormalizing your data in a way you will come to regret.

What you're trying to implement is called "partitioned tables", and you're right that bulk loading and deleting is one of the big advantages of partitioned tables. But Postgres doesn't support partitioned tables, so trying to roll your own is hard. Using schemas is a bad impedance match here too, as you've found out with the search_path. Inherited tables are a closer match, but a lot of things still won't be automatic, and primary keys may be a problem.

What most people use as the closest approximation to partitioned tables in Postgres is partial indexes. They let you keep each index to a reasonable size while still accelerating lookups on the poor-selectivity client_id column. You would still be storing all the records in a single table, though, and would have to do some big vacuuming whenever you delete a whole client; VACUUM FULL may be your friend.

I don't understand why the queries should be any slower against the normalized data in a single table than against the denormalized data spread over multiple tables. The one big exception is a query doing "where client_id = ?", where the per-client tables allow a sequential scan instead of an index scan.

--
greg
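[A sketch of the partial-index approach described above. The table and column names here are illustrative, not from the original thread, and the per-client index would need to be created for each client you care about.]

```sql
-- Hypothetical traffic-stats table (names are assumptions).
CREATE TABLE traffic (
    client_id  integer     NOT NULL,
    logged_at  timestamp   NOT NULL,
    bytes      bigint      NOT NULL
);

-- One small partial index per large client: the index only covers
-- rows WHERE client_id = 42, so it stays compact but still speeds
-- up queries that filter on that client.
CREATE INDEX traffic_client_42_idx
    ON traffic (logged_at)
    WHERE client_id = 42;

-- A query of this shape can use the partial index:
--   SELECT * FROM traffic
--    WHERE client_id = 42 AND logged_at >= '2004-01-01';

-- Closing a client's account is then a bulk delete plus a vacuum
-- to reclaim the dead space, rather than dropping a table.
DELETE FROM traffic WHERE client_id = 42;
DROP INDEX traffic_client_42_idx;
VACUUM FULL traffic;
```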