On Thu, Mar 11, 2004 at 12:51:28AM -0600, Michael Parker wrote:
> On Thu, Mar 11, 2004 at 07:36:07PM +1300, Sidney Markowitz wrote:
> > Michael Parker wrote:
> > >Is there a particular problem you are trying to solve?
> >
> > Yes, I'm trying to figure out why Kelsey sees the very high I/O
> > requirements that he does that blocks him from scaling up to the
> > multiple tens of thousands of users, while DSPAM claims to be running on
> > a site with 125,000 users.
>
> Kelsey will have to correct me if I'm wrong, but he's not seeing the
> high I/O with the MySQL bayes storage, he's seeing it with the DB_File
> solution.

That is correct. But we're having trouble grokking how MySQL could
resolve the problems. Granted, with a small table and queries being
served from cache, MySQL should scream. However, with tables approaching
1TB, which obviously can't be cached effectively, the SELECTs are going
to have to hit the disks.

Our estimate of the load associated with a message is as follows. On
average a message is broken down into 262 tokens (this is based on
Sidney's mail flow), and our target for deployment is ~2-3k msg/min
capacity.

For DB_File, the worst case works out to 25-52 MB/sec of read I/O
(4k read blocks * msg/sec * #tokens). Our benchmarking is pretty much in
line with the theoretical numbers. This doesn't take database updates
into account.

Now, I'll be honest, I don't have a good understanding of how this would
translate to the SQL backend. I'm installing the 3.0 snap now, and will
start to play around. Hopefully I'll have results shortly.

-- 
Kelsey Cummings - [EMAIL PROTECTED]      sonic.net, inc.
System Administrator                     2260 Apollo Way
707.522.1000 (Voice)                     Santa Rosa, CA 95407
707.547.2199 (Fax)                       http://www.sonic.net/
Fingerprint = D5F9 667F 5D32 7347 0B79  8DB7 2B42 86B6 4E2C 3896
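
The worst-case read estimate in the message (4k read blocks * msg/sec *
#tokens) can be written out as a short calculation. This is only a
sketch under stated assumptions: the function name is hypothetical, "4k"
is taken to mean 4096 bytes, and every token lookup is assumed to cost
exactly one uncached block read, with no updates counted.

```python
def worst_case_read_rate(msgs_per_min, tokens_per_msg=262, block_bytes=4096):
    """Worst-case read I/O in bytes/sec.

    Assumes one uncached block read per token lookup (no cache hits)
    and ignores database updates, as the estimate in the message does.
    """
    msgs_per_sec = msgs_per_min / 60.0
    return block_bytes * msgs_per_sec * tokens_per_msg


if __name__ == "__main__":
    # Target capacity range from the message: ~2-3k msg/min.
    for rate in (2000, 3000):
        mb_per_sec = worst_case_read_rate(rate) / 1e6
        print(f"{rate} msg/min -> {mb_per_sec:.1f} MB/sec")
```

With these inputs the formula gives roughly 36 MB/sec at 2k msg/min and
roughly 54 MB/sec at 3k msg/min (about 52 MB/sec if a block is counted as
4000 bytes), which is in line with the upper figure quoted. The lower
figure quoted would correspond to a message rate somewhat below 2k
msg/min under the same assumptions.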
