Before writing this mail, I did a little research on this topic and gathered some opinions from people like Fabian Pascal, who argues that logical design should be kept separate from physical design, among other sources. Since that is opinion rather than established fact, I'm writing to you, the people who make things work in the real world.
I believe he's right, or at least that you should only compromise your logical design once it becomes absolutely necessary due to physical limitations.
We started our first big project (by our company's standards) with normalization always in mind. But once we imported legacy data into the DB, things got harder.
One example is client status. A client might be active, inactive, or pending (for many reasons). We store every status a client has had since entering the system, and to check a client's current status we fetch the most recent row from this historical status table. That takes considerable time, so our best results came from writing a function that returns the status and indexing that function. The problem is that indexed functions must be immutable, so, as you can imagine, if the status changes after the index is created, the function's return value stays the same.
What do you suggest for situations like this? Should I add a field to the clients table to store the current status, keep recording history in the historical table, and control its changes with a trigger?
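For concreteness, here is a minimal sketch of the approach we tried; the table and column names (client_status, changed_at, and so on) are illustrative, not our real schema:

```sql
-- The function is declared IMMUTABLE so PostgreSQL will allow it in an
-- index, but it actually reads the table, so the value stored in the
-- index goes stale as soon as a newer status row is inserted.
CREATE FUNCTION current_status(p_client_id integer) RETURNS text AS $$
    SELECT status
    FROM client_status
    WHERE client_id = p_client_id
    ORDER BY changed_at DESC
    LIMIT 1;
$$ LANGUAGE sql IMMUTABLE;  -- this declaration is the lie

CREATE INDEX clients_current_status_idx
    ON clients (current_status(id));
```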
Trigger + history table is a common solution, it's easy to implement and there's nothing non-relational about it as a solution.
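A minimal sketch of that approach, assuming a history table client_status(client_id, status, changed_at) and a new current_status column on clients (names are illustrative):

```sql
-- Denormalized current status, kept in sync by a trigger on the
-- history table, so reads become a simple indexed column lookup.
ALTER TABLE clients ADD COLUMN current_status text;

CREATE FUNCTION sync_current_status() RETURNS trigger AS $$
BEGIN
    UPDATE clients
    SET current_status = NEW.status
    WHERE id = NEW.client_id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER client_status_sync
    AFTER INSERT ON client_status
    FOR EACH ROW EXECUTE PROCEDURE sync_current_status();
```

The history table stays fully normalized; the extra column is just a cache that the trigger keeps consistent.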
There are other situations that are making things difficult for us. For example, one query must return the total amount a client bought in the last 90 days. It takes too long when we need it for many clients, many times. So should I create summary tables to store this kind of data, update them with a trigger or a daily batch job (for example), and solve the problem with a single join?
One solution I use for this sort of thing is a summary table grouped by date, accurate up to the start of today. Then I check the summary table plus the "live" table for today's information and sum the two.
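A sketch of that scheme, assuming an orders(client_id, amount, order_date) table and a daily_sales summary refreshed once a night by a scheduled job (again, all names are illustrative):

```sql
-- Summary table: one row per client per day, covering all days
-- before today.
CREATE TABLE daily_sales (
    client_id  integer,
    sale_date  date,
    total      numeric,
    PRIMARY KEY (client_id, sale_date)
);

-- 90-day total = summarized history + today's live rows.
SELECT c.id,
       COALESCE(s.total, 0) + COALESCE(l.total, 0) AS last_90_days
FROM clients c
LEFT JOIN (SELECT client_id, SUM(total) AS total
           FROM daily_sales
           WHERE sale_date >= CURRENT_DATE - 90
           GROUP BY client_id) s ON s.client_id = c.id
LEFT JOIN (SELECT client_id, SUM(amount) AS total
           FROM orders
           WHERE order_date = CURRENT_DATE
           GROUP BY client_id) l ON l.client_id = c.id;
```

Only today's rows are scanned from the big table, so the query stays fast no matter how much history accumulates.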
Our database is not that big. The largest table has about 7,000,000 rows, and we have about 50,000 clients, half of them active. Everything I described above uses indexes for its queries, etc., but even so it hasn't been fast enough. We have a Dell server for this (I know, the Dell issue): a dual Xeon 2.8 GHz, SCSI disk, 1 GB of RAM. Do we need better hardware for our system?
Swap one of your processors for more RAM and disks, perhaps.
--
Richard Huxton
Archonet Ltd