Earl is correct. MHonArc computation is not the bottleneck; disk space is. Latency is still under 24 hours if things run at full tilt, but that doesn't happen when your disk I/O system is spending ages searching for the few free inodes amongst the billions already used.
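
For anyone wanting to check whether a filesystem is in a similar state, here is a minimal Python sketch that reports inode usage. It is only an illustration, not what mail-archive.com runs: the helper names `fraction_used` and `inode_usage` are made up for this example, and it assumes a Unix-like system where `os.statvfs` is available.

```python
import os


def fraction_used(total, free):
    """Return the fraction of inodes in use, or None if no total is reported.

    Some filesystems report a total of 0 inodes, so guard against division
    by zero instead of assuming a fixed inode table.
    """
    if total <= 0:
        return None
    return (total - free) / total


def inode_usage(path):
    """Return (total_inodes, free_inodes, fraction_used) for path's filesystem."""
    st = os.statvfs(path)  # Unix only; f_files = total inodes, f_ffree = free inodes
    return st.f_files, st.f_ffree, fraction_used(st.f_files, st.f_ffree)


if __name__ == "__main__":
    total, free, used = inode_usage(".")
    if used is None:
        print("filesystem does not report an inode total")
    else:
        print(f"inodes: {total} total, {free} free, {used:.1%} used")
```

On most systems `df -i` gives the same numbers; the point is only that inode exhaustion is a separate limit from bytes on disk.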
We do use a lot of CPU -- split evenly between MHonArc and HtDig indexing. Disk space is also split evenly between HTML and MHonArc index files (raw mail has long since been offloaded from the system). I've considered data compression on unchanging archives (i.e. mod_gzip on the HTML pages) but that just trades disk for CPU resources, and complicates the software. Can't do it on active archives or htdig indexing becomes expensive.

> Should there be message expiration [?]

Yeah, I think we've hit the runaway success point where it is helpful. I think I am going to limit maximum archive size to a few thousand messages after the new hardware is in place and there is some breathing room.

> What mail-archive.com is experiencing now is the resource limitations
> that a single individual can provide. It appears mail-archive has
> grown bigger than what Jeff ever thought it would. Since there are
> several open source projects that utilize the service, it would be
> nice that some contribution, like in resources, were provided to
> mail-archive to avoid problems like the current situation.

That's fundamentally the issue. I'm at the very edge of what I can provide as a hobbyist, considering that the next-generation hardware that makes sense stores 1 to 2 TB (at roughly $6K/TB), and I still have no good place to put such a machine on the net.

-Jeff
_______________________________________________
Gossip mailing list
[EMAIL PROTECTED]
http://jab.org/cgi-bin/mailman/listinfo/gossip