Danny Do wrote:
Hi guys,

I have had this problem for years but couldn't find a way to solve it.

I have a file server handling large files from 1MByte to 1GByte.
Server Info:
FreeBSD 6.2 Apache 2.2.9

DELL PowerEdge 1850
2GB RAM (only 184MB is active)
Gigabit Ethernet Connection

My server can output NO MORE than 60Mbps (read only).
The bottleneck is the hard disk. If I use ONE connection to download a file
from my server, the speed can go up to about 400Mbps.
If I let visitors download using multiple connections, the server cannot
output more than 60Mbps.
My service is similar to rapidshare/megaupload, I am wondering how they
configure their servers?

If I recall correctly, it doesn't cost much time to read the data from the
disk but it does cost a lot of time to seek to the data. Correct me if I am
wrong: if I increase the read buffer size, there would be fewer disk seeks
(disk accesses). Let's say the read buffer is 64K; if I increase it to 640K,
the disk seeks would reduce by 90%. Thus, more data can be read from the hard

What should I do now?

Try some different webservers. Apache is great, but it is designed to
be maximally flexible and capable of doing anything you can imagine
rather than to be absolutely as fast as possible.

There are some light-weight servers which have put work into optimizing
delivery of static content -- usually spoken of in the context of serving
images but any static files will be suitable material. Personally, I really
like nginx for this. Lots of people go for lighttpd and there are
a number of other alternatives in ports.

Also, depending on exactly how much content you have to serve and whether
certain items are very much more popular than others, a reverse proxy /
memory cache (a.k.a. HTTP accelerator) may help.  Varnish is the obvious
candidate here, but you'll have to experiment a bit to see what the optimal
settings are and if it actually helps at all.

If your website runs using a scripting language such as PHP, then another
possibility is memcached -- although described as a cache for dynamically
generated pages, it can cache just about anything, but you will need some
sort of scripting language to interface to it from your web server.  There
are memcached APIs for all popular languages and probably a few you've
never heard of...
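
To make the memcached idea concrete, here is a minimal sketch in Python of
the usual "look in the cache first, fall back to disk" pattern. Everything
named here is an assumption for illustration: FakeMemcache is a hypothetical
in-process stand-in so the example runs without a server, and fetch_cached
and load_from_disk are made-up helper names. A real memcached client library
would be expected to offer comparable get/set calls, but check its
documentation rather than relying on this.

```python
class FakeMemcache:
    """Hypothetical stand-in for a memcached client (get/set only)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        # Returns None on a miss, like a typical cache client.
        return self._store.get(key)

    def set(self, key, value, expire=0):
        # The expiry time is accepted but ignored in this stand-in.
        self._store[key] = value
        return True


def fetch_cached(client, key, loader, ttl=300):
    """Return (value, was_hit); call loader(key) only on a cache miss."""
    value = client.get(key)
    if value is not None:
        return value, True
    value = loader(key)            # the slow path, e.g. a disk read
    client.set(key, value, expire=ttl)
    return value, False


disk_reads = []

def load_from_disk(key):
    # Illustrative loader: records each call instead of touching a disk.
    disk_reads.append(key)
    return b"contents of " + key.encode()


cache = FakeMemcache()
first = fetch_cached(cache, "file-123", load_from_disk)
second = fetch_cached(cache, "file-123", load_from_disk)
```

The second request is answered from memory, so the loader (and hence the
disk) is only touched once per key until the entry expires or is evicted.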

The various caching strategies basically work because they keep recently
accessed files in RAM, avoiding an expensive round-trip to the HDD to
retrieve the data (memory access takes nanoseconds or microseconds, whereas
disk accesses take milliseconds). Of course, the OS itself also does exactly
the same thing in a general way, and FreeBSD is already very good in this
respect. Caching software however gives you more control over what gets
cached and for how long, enabling you to tune this specific application for
maximum performance.
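
As a rough illustration of "keep recently accessed files in RAM", the sketch
below is a small least-recently-used cache with a byte budget, written in
Python. It is not how any particular caching product is implemented; the
class name LRUFileCache, the 100-byte budget and the fake_disk dictionary
are all assumptions chosen to keep the example tiny and runnable.

```python
from collections import OrderedDict


class LRUFileCache:
    """Keep recently used file contents in memory, up to max_bytes."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._items = OrderedDict()   # oldest entry first

    def get(self, name, read_from_disk):
        if name in self._items:
            # Hit: mark as most recently used and skip the disk.
            self._items.move_to_end(name)
            return self._items[name]
        data = read_from_disk(name)   # miss: pay for the disk access
        if len(data) <= self.max_bytes:
            self._items[name] = data
            self.used_bytes += len(data)
            # Evict least recently used entries until within budget.
            while self.used_bytes > self.max_bytes:
                _, evicted = self._items.popitem(last=False)
                self.used_bytes -= len(evicted)
        return data

    def cached_names(self):
        return list(self._items)


# Hypothetical "disk": three 40-byte files, with reads recorded.
fake_disk = {"a": b"x" * 40, "b": b"y" * 40, "c": b"z" * 40}
disk_reads = []

def read_from_disk(name):
    disk_reads.append(name)
    return fake_disk[name]


cache = LRUFileCache(max_bytes=100)
for name in ["a", "b", "a", "c"]:
    cache.get(name, read_from_disk)
```

With room for only two of the three files, the repeated request for "a" is
served from memory, and "b" is the one dropped when "c" arrives because it
was used least recently. The budget and eviction rule are the sort of knobs
that dedicated caching software lets you tune.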



Dr Matthew J Seaman MA, D.Phil.                   7 Priory Courtyard
                                                 Flat 3
PGP: http://www.infracaninophile.co.uk/pgpkey     Ramsgate
                                                 Kent, CT11 9PW
