Hard disk bottle neck.

2008-09-28 Thread Danny Do
Hi guys,

I have had this problem for years but couldn't find a way to solve it.

I have a file server handling large files from 1MByte to 1GByte. 

Server Info:
FreeBSD 6.2 
Apache 2.2.9

DELL PowerEdge 1850
2GB RAM (only 184MB is active)
6x300MB SCSI 10K RPM RAID5
Gigabit Ethernet Connection

My server can output NO MORE than 60Mbps (read only). 

The bottleneck is the hard disk. If I use ONE connection to download a file
from my server, the speed can reach about 400Mbps.

If I let visitors download using multiple connections, the server cannot
output more than 60Mbps.

My service is similar to rapidshare/megaupload, and I am wondering how they
configure their servers.

If I recall correctly, reading the data from the disk costs little time, but
seeking to the data costs a lot. Correct me if I am wrong: if I increase the
read buffer size, there would be fewer disk seeks (disk accesses). Say the
read buffer is 64K; if I increase it to 640K, the number of disk seeks would
drop by 90%, so more data could be read from the hard drive in the same time.
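
A rough way to check that reasoning is to model each read request as one seek
plus a sequential transfer. The sketch below is only an illustration: the 8 ms
positioning time and the 70 MB/s media rate are assumed round numbers for a
10K RPM disk, not measurements from this server, and `seeks_needed` and
`effective_rate` are hypothetical helper names.

```python
import math

def seeks_needed(file_bytes: int, read_bytes: int) -> int:
    """Worst case: one seek per read request, as when many clients interleave."""
    return math.ceil(file_bytes / read_bytes)

def effective_rate(read_bytes: int, seek_s: float, media_bytes_per_s: float) -> float:
    """Bytes per second delivered when every read request pays one seek."""
    return read_bytes / (seek_s + read_bytes / media_bytes_per_s)

FILE = 100 * 1024 * 1024   # a typical 100MB file
SEEK = 0.008               # ASSUMED: about 8 ms seek plus rotational latency
MEDIA = 70e6               # ASSUMED: about 70 MB/s sequential rate per disk

for kib in (64, 640):
    size = kib * 1024
    rate_mb = effective_rate(size, SEEK, MEDIA) / 1e6
    print(f"{kib}K reads: {seeks_needed(FILE, size)} seeks, {rate_mb:.1f} MB/s per disk")
```

For a 100MB file this gives 1600 requests at 64K and 160 at 640K, which is the
90% reduction in seeks described above. Under the assumed figures the per-disk
rate rises several times over, but by less than tenfold, because transfer time
starts to dominate once the reads are large.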

What should I do now?


Any suggestion is appreciated!

Danny Do

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Hard disk bottle neck.

2008-09-28 Thread Wojciech Puchar

> Server Info:
> FreeBSD 6.2
> Apache 2.2.9
>
> DELL PowerEdge 1850
> 2GB RAM (only 184MB is active)

So what is happening with the other 1.8GB?

> 6x300MB SCSI 10K RPM RAID5

300MB disks at 10K RPM? Did such disks ever exist?

> Gigabit Ethernet Connection
>
> My server can output NO MORE than 60Mbps (read only).

Do you mean Mbps or MBps?

> The bottle neck is the hard disk. If I use ONE connection to download file
> from my server, the speed can go up to about 400Mbps.
>
> If I let visitors download using multiple connections, the server cannot
> output more than 60Mbps.
>
> My service is similar to rapidshare/megaupload, I am wondering how they
> configure their servers?

Patch /usr/src/sys/sys/param.h:

#ifndef DFLTPHYS
#define DFLTPHYS        (1024 * 1024)   /* default max raw I/O transfer size */
#endif
#ifndef MAXPHYS
#define MAXPHYS         (1024 * 1024)   /* max raw I/O transfer size */
#endif
#ifndef MAXDUMPPGS






Re: Hard disk bottle neck.

2008-09-28 Thread Matthew Seaman

Danny Do wrote:

> What should I do now?


Try some different webservers. Apache is great, but it is designed to
be maximally flexible and capable of doing anything you can imagine
rather than to be absolutely as fast as possible.

There are some light-weight servers which have put work into optimizing
delivery of static content -- usually spoken of in the context of serving 
images but any static files will be suitable material.  Personally, I 
really like nginx for this.  Lots of people go for lighttpd and there are
a number of other alternatives in ports.

Also, depending on exactly how much content you have to serve and whether
certain items are very much more popular than others, a reverse proxy / memory
cache (a.k.a. HTTP accelerator) may help.  Varnish is the obvious
candidate here, but you'll have to experiment a bit to see what the optimal
settings are and whether it actually helps at all.

If your website runs using a scripting language such as PHP, then another
possibility is memcached -- although described as a cache for dynamically
generated pages, it can cache just about anything, but you will need some
sort of scripting language to interface to it from your web server.  There
are memcached APIs for all popular languages and probably a few you've 
never heard of...


The various caching strategies basically work because they keep recently
accessed files in RAM, avoiding an expensive round-trip to the HDD to
retrieve the data (memory accesses take nanoseconds or microseconds; disk
accesses take milliseconds).  Of course, the OS itself also does exactly
the same thing in a general way, and FreeBSD is already very good in this
respect.  Caching software, however, gives you more control over what gets
cached and for how long, enabling you to tune this specific application
for maximum performance.


Cheers,

Matthew

--
Dr Matthew J Seaman MA, D.Phil.                   7 Priory Courtyard
                                                  Flat 3
PGP: http://www.infracaninophile.co.uk/pgpkey     Ramsgate
                                                  Kent, CT11 9PW





Re: Hard disk bottle neck.

2008-09-28 Thread Bill Moran
Danny Do [EMAIL PROTECTED] wrote:

> Hi guys,
>
> I have this problem for years but couldn't find a way to solve it.
>
> I have a file server handling large files from 1MByte to 1GByte.
>
> Server Info:
> FreeBSD 6.2
> Apache 2.2.9
>
> DELL PowerEdge 1850
> 2GB RAM (only 184MB is active)
> 6x300MB SCSI 10K RPM RAID5
> Gigabit Ethernet Connection
>
> My server can output NO MORE than 60Mbps (read only).
>
> The bottle neck is the hard disk.

What evidence do you have that the bottleneck is disk IO?  I've seen no
evidence, only speculation.

In addition to the advice of others, you may be able to just beef up the
RAM.  2G isn't much these days.  If you've got 200M active, you've got
about 1.8G available to cache files.  If you have repeated access of the
same file, the OS can cache that file data and not even use the disk, but
it can only do that if it has enough RAM to work with.  You need to get
your facts straight, though.  According to the specs you've got above,
you've only got 1.5G of disk.  I expect you meant 300G disks.

You could also add disks in a RAID 10, which is generally faster than
RAID 5, or move to 15,000 RPM disks.  I think you might be surprised how
much adding some RAM will help, though.  Unless your access patterns are
very random, RAM should speed up access to popular data significantly.

-- 
Bill Moran
http://www.potentialtech.com


Re: Hard disk bottle neck.

2008-09-28 Thread Diego F. Arias R.
First, be sure your bottleneck really is the hard drives.

-- 
mmm, interesante.


RE: Hard disk bottle neck.

2008-09-28 Thread Danny Do
Hi Matthew, Wojciech Puchar and others,

First of all, I'd like to correct a typo and add some detail:
- I have 6x300GB SCSI 10K RPM hard drives.
- Most of my files are about 100MB, and many are as big as 1GB.
- Caching is not an option.

Thanks for the advice, but caching is not an option for me because the files
are that large.

I tried Lighty a few years ago but it didn't help. I think the problem is
disk seeks. If I can reduce disk seeks by increasing the read buffer, I think
the problem would be solved.

I am thinking of trying Wojciech Puchar's method of patching the kernel with
the following change:

patch /usr/src/sys/sys/param.h

#ifndef DFLTPHYS
#define DFLTPHYS        (1024 * 1024)   /* default max raw I/O transfer size */
#endif
#ifndef MAXPHYS
#define MAXPHYS         (1024 * 1024)   /* max raw I/O transfer size */
#endif
#ifndef MAXDUMPPGS

I'll report back with the results, probably sometime in the next fortnight.

Thanks everyone, thanks Wojciech Puchar,

Danny




RE: Hard disk bottle neck.

2008-09-28 Thread Wojciech Puchar

> I'll update the result. I'll tell you how I go. Maybe sometimes in the next
> fortnight.
>
> Thanks everyone, thanks Wojciech Puchar,


After you recompile the kernel with that patch, check your disk
performance in a directory containing many large files:

cd that_dir
for x in *; do cat "$x" > /dev/null; done

while running systat with the vmstat display (systat -vmstat, or type
:vmstat inside systat) on another console.

I have just done this on one of my systems, with ONE 500GB SATA drive and
geli encryption, and got 48MB/s at about 50% CPU load on a Core 2 Duo.
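
The same read test can be sketched in Python so that it reports byte and time
totals directly. This is a minimal illustration, not what was run above:
`read_throughput` is a hypothetical helper and the 1 MiB chunk size is an
arbitrary choice.

```python
import os
import time

def read_throughput(directory, chunk_bytes=1024 * 1024):
    """Read every regular file in `directory` once; return (bytes_read, seconds)."""
    total = 0
    start = time.perf_counter()
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories and special files
        # buffering=0 avoids an extra Python-level buffer; the OS cache still applies
        with open(path, "rb", buffering=0) as f:
            while True:
                chunk = f.read(chunk_bytes)
                if not chunk:
                    break
                total += len(chunk)
    return total, time.perf_counter() - start
```

Call it on the directory holding the large files and divide the byte count by
the elapsed seconds to get a rate. Files that are already in the OS cache are
served from RAM, so to measure the disks use a data set larger than RAM or run
it straight after a reboot, both before and after changing the kernel.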


RE: Hard disk bottle neck.

2008-09-28 Thread Wojciech Puchar

> I'll update the result. I'll tell you how I go. Maybe sometimes in the next
> fortnight.
>
> Thanks everyone, thanks Wojciech Puchar,
>
> Danny


Anyway, how is your RAID5 configured? Did you select SMALL stripe sizes?

With small stripes, every large read uses 3 disks in parallel, instead of
multiple reads being spread across multiple disks.

RAID5 read performance is high when it is configured properly and the RAID
implementation is a good one.
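
The effect of stripe size can be illustrated with a simplified model. The
sketch assumes plain round-robin striping over the data disks and ignores
RAID5 parity rotation; `disks_touched` and the sizes used are hypothetical
and are not taken from this server's controller.

```python
def disks_touched(offset: int, length: int, stripe_unit: int, data_disks: int) -> int:
    """Number of distinct disks one contiguous read has to visit.

    ASSUMPTION: stripe units are laid out round-robin over `data_disks`
    disks; parity placement is ignored.
    """
    first_unit = offset // stripe_unit
    last_unit = (offset + length - 1) // stripe_unit
    return min(data_disks, last_unit - first_unit + 1)

READ = 64 * 1024      # one 64K read request
DATA_DISKS = 5        # 6-disk RAID5 holds 5 data units per stripe

for unit_kib in (16, 128):
    n = disks_touched(0, READ, unit_kib * 1024, DATA_DISKS)
    print(f"{unit_kib}K stripe unit: a 64K read touches {n} disk(s)")
```

With five data disks, an aligned 64K read touches 4 disks at a 16K stripe unit
but only 1 disk at a 128K stripe unit. In the second case the remaining disks
stay free to seek for other clients, which is what matters when many downloads
run at once.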





Re: Hard disk bottle neck.

2008-09-28 Thread Bill Moran
Wojciech Puchar [EMAIL PROTECTED] wrote:

> after you recompile the kernel with that patch, check your disk
> performance in some directory consisting of many large files
>
> cd that_dir
> for x in *;do (cat $x > /dev/null );done
>
> while running systat,:vmstat on another console

More specifically, do this before and after you make the change, to
demonstrate whether or not you actually fixed the problem.

-- 
Bill Moran
http://www.potentialtech.com


Re: Hard disk bottle neck.

2008-09-28 Thread Diego F. Arias R.


Did you check gstat?

If the patch doesn't work, you could try splitting the array (2 RAID 5
sets) or, better, use RAID 10. RAID 5 isn't a top-performance RAID level.

-- 
mmm, interesante.


Re: Hard disk bottle neck.

2008-09-28 Thread Wojciech Puchar


> If the patch doesn't work, maybe you could try to split the raid (2 raid
> 5) or better use a raid 10. The raid 5 isn't a top performance raid.


A properly configured RAID5 is top performing on reads.


RE: Hard disk bottle neck.

2008-09-28 Thread Danny Do
Hi Diego,

I use RAID5 because I don't want to waste too much space on redundancy
while still getting good read performance. Over 99% of disk accesses are
expected to be reads.

I could split the array into 2xRAID5, but I would have difficulty with file
management later. Furthermore, the system would then use 2 disks for parity,
and I don't want to lose that much space. [EMAIL PROTECTED] SCSI disks are
still very expensive. :(







RE: Hard disk bottle neck.

2008-09-28 Thread Wojciech Puchar

> The reason I use RAID5 because I don't want to waste too much space on
> redundancy whilst taking the advantage of read. Over 99% of disk access are
> expected to be reading.

In that case RAID5 is perfect, as long as it is properly set up.