Danny Do wrote:
Hi guys,

I have had this problem for years but couldn't find a way to solve it. I have a file server handling large files, from 1MByte to 1GByte.

Server info:
FreeBSD 6.2
Apache 2.2.9
DELL PowerEdge 1850
2GB RAM (only 184MB is active)
6x300MB SCSI 10K RPM RAID5
Gigabit Ethernet connection

My server can output NO MORE than 60Mbps (read only). The bottleneck is the hard disk. If I use ONE connection to download a file from my server, the speed can go up to about 400Mbps. If I let visitors download using multiple connections, the server cannot output more than 60Mbps.

My service is similar to rapidshare/megaupload, and I am wondering how they configure their servers. If I recall correctly, it doesn't cost much time to read the data from the disk, but it does cost a lot of time to seek to the data. Correct me if I am wrong: if I increase the read buffer size, there would be fewer disk seeks (disk accesses). Let's say the read buffer is 64K; if I increase it to 640K, the number of disk seeks would drop by 90%. Thus, more data can be read from the hard drive.

What should I do now?
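On the read-buffer question, the intuition can be checked with a back-of-the-envelope model: if every read costs one seek plus the time to transfer the data, the delivered rate is read size divided by (seek time + transfer time). The sketch below is only that model. The function name and the figures of 8 ms per seek and 50 MB/s sequential transfer are assumptions chosen for illustration, not measurements from the server described above.

```python
def effective_throughput(read_size_bytes, seek_time_s, transfer_rate_bytes_per_s):
    """Bytes per second delivered when every read costs one seek plus the transfer."""
    transfer_time_s = read_size_bytes / transfer_rate_bytes_per_s
    return read_size_bytes / (seek_time_s + transfer_time_s)


# Assumed figures for illustration only, not measured on the poster's hardware.
SEEK_TIME_S = 0.008          # about 8 ms average seek plus rotational delay
TRANSFER_RATE = 50_000_000   # about 50 MB/s sustained sequential read

for size_kib in (64, 640):
    rate = effective_throughput(size_kib * 1024, SEEK_TIME_S, TRANSFER_RATE)
    print(f"{size_kib:4d} KiB reads: {rate * 8 / 1e6:6.1f} Mbit/s")
```

With these assumed numbers the model gives roughly 56 Mbit/s for 64 KiB reads and roughly 248 Mbit/s for 640 KiB reads. Larger reads do amortise the seek, but the gain flattens out as transfer time starts to dominate. The model also ignores the RAID layout, the OS read-ahead and any caching, so treat it as a rough guide rather than a prediction.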
Try some different webservers. Apache is great, but it is designed to be maximally flexible and capable of doing anything you can imagine rather than to be absolutely as fast as possible. There are some light-weight servers which have put work into optimizing delivery of static content -- usually spoken of in the context of serving images, but any static files will be suitable material. Personally, I really like nginx for this. Lots of people go for lighttpd, and there are a number of other alternatives in ports.

Also, depending on exactly how much content you have to serve and whether certain items are very much more popular than others, a reverse proxy / memory cache (a.k.a. HTTP accelerator) may help. varnish is the obvious candidate here, but you'll have to experiment a bit to see what the optimal settings are and whether it actually helps at all.

If your website runs using a scripting language such as PHP, then another possibility is memcached. Although described as a cache for dynamically generated pages, it can cache just about anything, but you will need some sort of scripting language to interface to it from your web server. There are memcached APIs for all popular languages and probably a few you've never heard of...
The various caching strategies basically work because they keep recently accessed files in RAM, avoiding an expensive round-trip to the HDD to retrieve the data (memory access takes nanoseconds or microseconds, whereas disk accesses take milliseconds). Of course, the OS itself also does exactly the same thing in a general way, and FreeBSD is already very good in this respect. Caching software, however, gives you more control over what gets cached and for how long, enabling you to tune this specific application for maximum performance.
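To make the idea concrete, here is a minimal sketch of "keep recently accessed files in RAM" as a least-recently-used cache with a byte budget. It is an illustration only, not how varnish, memcached or the FreeBSD buffer cache is actually implemented. The class name `FileCache` and its attributes are hypothetical.

```python
from collections import OrderedDict


class FileCache:
    """Keep recently read files in RAM, evicting the least recently used first."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self.entries = OrderedDict()   # path -> file contents, oldest first
        self.hits = 0
        self.misses = 0

    def get(self, path):
        if path in self.entries:
            # Cache hit: served from memory, no disk access needed.
            self.entries.move_to_end(path)
            self.hits += 1
            return self.entries[path]

        # Cache miss: this is the expensive trip to the disk.
        self.misses += 1
        with open(path, "rb") as f:
            data = f.read()

        # Files larger than the whole budget are returned but never cached.
        if len(data) <= self.max_bytes:
            self.entries[path] = data
            self.used_bytes += len(data)
            while self.used_bytes > self.max_bytes:
                _, evicted = self.entries.popitem(last=False)
                self.used_bytes -= len(evicted)
        return data
```

Calling `cache.get(path)` twice for the same file reads the disk once and serves the second request from memory. The budget (`max_bytes`) is the sort of knob that caching software exposes and the OS cache does not. A real server would stream large files in chunks rather than load a 1GByte file whole, so this only shows the eviction logic.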
Cheers,

Matthew

-- 
Dr Matthew J Seaman MA, D.Phil.
7 Priory Courtyard, Flat 3
Ramsgate, Kent, CT11 9PW
PGP: http://www.infracaninophile.co.uk/pgpkey