Hot Diggety! Gili was rumored to have written:
> 
>       Databases are *supposed* to be able to contain massive amounts of 
>       data ;) I'm in favor of purging idle/old data, if at all.

I'm not sure if I will ever need to purge old data for Roller, but in my
past experience with other databases, I have often needed to either
purge or archive and then purge once things get large enough.

Yes, databases do hold large amounts of data but they're usually
partitioned instead of dumping everything in single large tables. :)

(At work, we run some very large databases... one takes 5 weeks to restore!)

For example:

Suppose you had 10 years' worth of data, all in one indexed table. Can
you imagine how big the index would be, and how much memory would be
needed to hold it? If the index is big enough, it may actually work
against the hardware's memory caching and slow things down a bit.

Suppose you normally only cared about 30 days' worth of data, and gave
it a table of its own for, say, year 2005, month 12, and indexed that.
The index would be MUCH smaller, require less memory, and make updates
and queries faster.

You would then archive the other 9 years and 11 months of data into
annual or monthly tables. You can still search them if desired; they
are just not all in one big table.
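
To make the monthly-table idea concrete, here is a rough sketch using
Python's built-in sqlite3 module. The table names (entries_current,
entries_YYYY_MM), the columns, and the helper functions are all made up
for illustration; this is not Roller's schema, and a production database
would more likely use its own native partitioning features.

```python
import sqlite3
from datetime import date


def month_table(d: date) -> str:
    """Name of the hypothetical per-month archive table, e.g. entries_2005_12."""
    return f"entries_{d.year:04d}_{d.month:02d}"


def ensure_table(conn: sqlite3.Connection, name: str) -> None:
    """Create a table and its date index if they do not exist yet."""
    # Table names cannot be bound as SQL parameters, so `name` must only
    # come from trusted code such as month_table().
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {name} ("
        "id INTEGER PRIMARY KEY, posted TEXT NOT NULL, body TEXT)"
    )
    conn.execute(f"CREATE INDEX IF NOT EXISTS {name}_posted ON {name} (posted)")


def archive_old_rows(conn: sqlite3.Connection, current: date) -> int:
    """Move rows dated before the first day of `current`'s month out of
    entries_current and into per-month tables. Returns the number moved."""
    cutoff = current.replace(day=1).isoformat()
    with conn:  # commit on success, roll back on error
        rows = conn.execute(
            "SELECT id, posted, body FROM entries_current WHERE posted < ?",
            (cutoff,),
        ).fetchall()
        for row_id, posted, body in rows:
            target = month_table(date.fromisoformat(posted))
            ensure_table(conn, target)
            conn.execute(
                f"INSERT INTO {target} (id, posted, body) VALUES (?, ?, ?)",
                (row_id, posted, body),
            )
        conn.execute("DELETE FROM entries_current WHERE posted < ?", (cutoff,))
    return len(rows)
```

Calling archive_old_rows(conn, date(2005, 12, 15)) would leave only the
December 2005 rows in entries_current, so its index stays small, while
older rows remain searchable in their own monthly tables.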

Kind of like... if a friend asks what grades you earned in school, do
you go through 10 years worth of school report cards to answer that
question, or do you check the most recent report card or two? Which is
more useful for the typical inquiry?

Another benefit: if you can break the data up into multiple tables, you
may be able to put the archived tables' datafiles on cheaper
archive-only disk arrays and keep the main current table on faster
(and more expensive) drives.

Also, if you ever have to do a disaster recovery restore, it is often
far better to restore multiple tables, because you can be ready for
most of your users simply by restoring a small main table first instead
of having to wait hours for all the data in a huge table to finish
restoring from tape, disk, or another site.

I haven't personally seen Roller used in large-scale enterprise setups,
but I believe Roller is scalable and well designed for small and
midrange setups. With a little more design work, it should be able to
fit in well with even large enterprise setups.

-Dan
