Hot Diggety! Gili was rumored to have written:
>
> Databases are *supposed* to be able to contain massive amounts of
> data ;) I'm in favor of purging idle/old data, if at all.

I'm not sure if I will ever need to purge old data for Roller, but from past experience with other databases, I have often needed to either purge or archive+purge if/when things get large enough. Yes, databases do hold large amounts of data, but they're usually partitioned instead of dumping everything into single large tables. :) (At work, we run some very large databases... one takes 5 weeks to restore!)

For example: Suppose you had 10 years' worth of data, all in the same indexed table. Can you imagine how big the index would be, and how much memory would be needed to hold it? If it is big enough, it may actually work against hardware memory caching and slow things down a bit.

Now suppose you normally only really cared about 30 days' worth of data, and made it a table of its own for, say, year 2005, month 12, and indexed that. It would have a MUCH smaller index, require less memory, and be faster for updates and queries. Then archive the other 9 years 11 months of data into annual or monthly tables. You can still search them if desired, just not all in one big table.

Kind of like... if a friend asks what grades you earned in school, do you go through 10 years' worth of school report cards to answer that question, or do you check the most recent report card or two? Which is more useful for the typical inquiry?

Other benefits: if you can break the data up into multiple tables, you may be able to put the archived tables' datafiles on cheaper archive-only disk arrays and keep the main current table on faster (and more expensive) drives.

Also, if you have to do a disaster recovery restore, it is often far better to restore multiple tables, because you can be ready for most of your users simply by restoring a small main table first, instead of having to wait hours for all the data in a huge table to finish restoring from tape, disk, or another site.

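To make the monthly-table idea a bit more concrete, here is a rough sketch in Python using the standard sqlite3 module. Everything in it is made up for illustration: the entries_YYYY_MM table names, the columns, and the helper functions are assumptions, not Roller's actual schema, and a production database would more likely use its own built-in partitioning features than hand-rolled tables like these.

```python
import sqlite3


def table_for(year, month):
    """Name of the hypothetical per-month table, e.g. entries_2005_12."""
    return f"entries_{int(year):04d}_{int(month):02d}"


def ensure_table(conn, year, month):
    """Create the month's table and its (small) index if they don't exist yet."""
    name = table_for(year, month)
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {name} ("
        "id INTEGER PRIMARY KEY, posted TEXT NOT NULL, title TEXT NOT NULL)"
    )
    conn.execute(
        f"CREATE INDEX IF NOT EXISTS idx_{name}_posted ON {name} (posted)"
    )
    return name


def insert_entry(conn, posted, title):
    """Route a row to its month's table; posted is an ISO date 'YYYY-MM-DD'."""
    name = ensure_table(conn, posted[:4], posted[5:7])
    conn.execute(
        f"INSERT INTO {name} (posted, title) VALUES (?, ?)", (posted, title)
    )


def recent_entries(conn, year, month):
    """The typical query: touch only the one small table for that month."""
    name = ensure_table(conn, year, month)
    return conn.execute(
        f"SELECT posted, title FROM {name} ORDER BY posted"
    ).fetchall()


def search_all(conn, word):
    """The rare query: walk every monthly table, oldest first."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name GLOB 'entries_*' ORDER BY name"
        )
    ]
    hits = []
    for name in tables:
        hits.extend(
            conn.execute(
                f"SELECT posted, title FROM {name} WHERE title LIKE ?",
                (f"%{word}%",),
            ).fetchall()
        )
    return hits


# Small demo with an in-memory database and invented rows.
conn = sqlite3.connect(":memory:")
insert_entry(conn, "2005-12-03", "Partitioning notes")
insert_entry(conn, "2005-12-17", "Restore drill")
insert_entry(conn, "1996-01-15", "Old notes")
print(recent_entries(conn, 2005, 12))
print(search_all(conn, "notes"))
```

The everyday query only ever opens the current month's table and its index, while the occasional full search pays the cost of visiting each archive table in turn. In this toy version the archive tables sit in the same database; moving them to cheaper storage or restoring them later would be handled by whatever the real database offers for that.
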
I haven't personally seen Roller used in large-scale enterprise setups, but I believe Roller is scalable and well designed for small and midrange setups. With a little more design work, it should be able to fit in well with even large enterprise setups.

-Dan
