I have been a MySQL user for years, running a few multi-gigabyte databases for my websites and using it to host algebra.com (a roughly 12 GB database).
I have had my ups and downs with MySQL. The ups were ease of use and decent performance for small databases such as algebra.com. The downs were things like a twenty-hour REPAIR TABLE operation on a 35 GB table.

Right now I have a personal (single-user) project to build a 5-10 terabyte data warehouse. The largest table will consume most of the space and will hold perhaps 200,000,000 rows. I want to use it to extract valuable business intelligence and to make money. I expect the data to grow, never shrink, and to be accessed via batch queries. The queries do not need to be super fast; an hour per query would be just fine, but two weeks per query would not. The server has about 18 TB of storage, 48 GB of RAM, and 12 CPU cores.

My initial plan was to use MySQL with InnoDB and deal with problems as they arise, perhaps even implementing my own joining procedures. After reading some disparaging things about InnoDB performance on large datasets, however, I am getting cold feet. I have a general feeling that I may not be able to succeed with MySQL, or perhaps with either MySQL or Postgres. I do not know much about Postgres, but I am very eager to learn and to see whether I can use it more effectively than MySQL for this purpose. I cannot shell out $47,000 per CPU for Oracle for this project.

To be more specific, the batch queries I hope to run would either be small JOINs of a small dataset to the large table, or plain SELECTs against the one big table (a rough sketch of what I have in mind is at the end of this post).

So... can Postgres support a 5-10 TB database with the use pattern stated above?

Thanks!
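For concreteness, here is a minimal sketch of the two query shapes I mean. The table and column names (big_fact, small_dim, dim_id, amount, event_date) are made up for illustration, not a real schema; big_fact stands in for the ~200,000,000-row table and small_dim for the small dataset joined to it.

    -- Shape 1: small JOIN of a small dataset to the large table,
    -- filtered and aggregated in one pass.
    SELECT d.category, SUM(f.amount) AS total
    FROM big_fact f
    JOIN small_dim d ON d.id = f.dim_id
    WHERE f.event_date >= DATE '2011-01-01'
    GROUP BY d.category;

    -- Shape 2: plain SELECT/aggregate over the one big table.
    SELECT f.dim_id, COUNT(*) AS n, AVG(f.amount) AS avg_amount
    FROM big_fact f
    GROUP BY f.dim_id;

Both would run as unattended batch jobs, so full scans of big_fact are acceptable as long as they finish in hours rather than weeks.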