Hi.

Kris Kennaway wrote:

After some time of running under high load, disk performance becomes extremely poor. During those periods 'systat -vm 1' shows something like
this:
What does "high load" mean? You need to explain the system workload more.
This web service is similar to YouTube. This server is a video store. I
have around 200G of *.flv (flash video) files on the server.

I run lighttpd as a web server. Disk load is usually around 50%, network
output 100Mbit/s, 100 simultaneous connections. CPU is mostly idle.

As you can see, it is a trivial service: sending files to the network via HTTP.
Does lighttpd actually use HTTP accept filters?
I don't know how to make sure, but it seems to run the appropriate setsockopt (truss output):

setsockopt(0x4,0xffff,0x1000,0x7fffffffe620,0x100) = 0 (0x0)
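To read the numeric arguments of that truss line, here is a minimal sketch that compares them with the values I believe FreeBSD's <sys/socket.h> defines (SOL_SOCKET = 0xffff, SO_ACCEPTFILTER = 0x1000) and with the size of struct accept_filter_arg (16 + 240 bytes). The helper name decode_setsockopt is hypothetical. Note that truss prints only the pointer to the argument, so this cannot tell which filter (for example httpready or dataready) was requested.

```python
import re

# Constants as defined in FreeBSD's <sys/socket.h>; written out by hand
# because Python's socket module does not expose them on every platform.
SOL_SOCKET = 0xFFFF
SO_ACCEPTFILTER = 0x1000
# struct accept_filter_arg is char af_name[16] followed by char af_arg[240].
ACCEPT_FILTER_ARG_SIZE = 16 + 240


def decode_setsockopt(truss_line):
    """Decode the numeric arguments of a truss setsockopt() line.

    Hypothetical helper: it only checks the numbers, it cannot see the
    filter name because truss shows just the pointer to the argument.
    """
    match = re.search(r"setsockopt\(([^)]*)\)", truss_line)
    if match is None:
        raise ValueError("not a setsockopt line")
    fd, level, optname, _optval, optlen = (
        int(arg, 16) for arg in match.group(1).split(",")
    )
    return {
        "fd": fd,
        "level_is_sol_socket": level == SOL_SOCKET,
        "opt_is_accept_filter": optname == SO_ACCEPTFILTER,
        "len_matches_filter_arg": optlen == ACCEPT_FILTER_ARG_SIZE,
    }


line = "setsockopt(0x4,0xffff,0x1000,0x7fffffffe620,0x100) = 0 (0x0)"
print(decode_setsockopt(line))
```

All three checks hold for the line above, and the call returned 0, which is consistent with an accept filter being installed on descriptor 4.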

Are you using ipfilter and ipfw? You are paying a performance penalty for having them.
I'm using ipfw, and one of the first rules passes all established TCP. ipfilter is not used on this server, but it is present in the kernel because it may be used on other servers. The CPU is 95% idle, so I think the packet filters do not produce significant load on this server.

You might try increasing BUCKET_MAX in sys/vm/uma_core.c. I don't really understand the code here, but you seem to be hitting a threshold behaviour where you are constantly running out of space in the per CPU caches.
Thanks, I'll try this.

This can happen if your workload is unbalanced between the CPUs and you are always allocating on one but freeing on another, but I wouldn't expect it should happen on your workload. Maybe it can also happen if your turnover is high enough.
This is very unlikely, because I have 5 other video storage servers with the same hardware and software configuration, and they work fine.

On the other hand, all the other servers were put into production before or after the problematic servers and were filled with content in other ways, so they could have a slightly different load pattern.

In total, I have faced this bug three times:

1. The first time it was, AFAIR, 5.4-RELEASE on a Dell 2850 with the same configuration as now. It was an mp3 store, and I used thttpd as the HTTP server to serve the mp3s. At that time the problems were not so frequent, but it took too long to get back to normal operation, so we had to reboot the servers once a week or so.

The problems began when we moved to new hardware, the Dell 2850. At that time we suspected the amrd driver and had no time to dig in, because all the servers of the project were affected. Installing Linux helped.

2. The second time it was a server for the static files of a very popular blog. The HTTP server was nginx, and the disk contained pictures, mp3s and videos. It was a Dell 1850 with a 2x146 SCSI mirror. Linux also solved the problem.

3. The problem we see now.

At first glance one could say that the problem is in Dell's x850 series or amr(4), but we run this hardware on many other projects and it works well. Linux also works on it.

And a few hours ago I received feedback from Andrzej Tobola. He has the same problem on FreeBSD 7 with a Promise ATA software RAID0 array:

===
Subject: Re: amrd disk performance drop after running under high load
Date: Tue, 16 Oct 2007 10:59:34 +0200
From: Andrzej Tobola <[EMAIL PROTECTED]>
To: Alexey Popov <[EMAIL PROTECTED]>

<skip>

Exactly the same here but on big ata RAID0 with big traffic (~10GB/24h):

amper% df -h /ftp/priv
Filesystem    Size    Used   Avail Capacity  Mounted on
/dev/ar0a     744G    679G    4.7G    99%    /ftp/priv

amper% grep ^ar /var/run/dmesg.boot
ar0: 763108MB <Promise Fasttrak RAID0 (stripe 64 KB)> status: READY
ar0: disk0 READY using ad6 at ata3-master
ar0: disk1 READY using ad4 at ata2-master

amper% uname -a
FreeBSD xxx 7.0-CURRENT-200709 FreeBSD 7.0-CURRENT-200709 #0: Tue Sep 11 04:44:48 UTC 2007 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC i386

I am rebooting if I reach this state (approx. a week).
It is an old bug - a few months ;)

cheers,
-a

===

So I conclude that FreeBSD has a long-standing bug in the VM that can be triggered when serving a large amount of static data (much bigger than memory size) at high rates. Possibly this only applies to large files like mp3s or video.

What does vmstat -z show during the good and bad times?
I'll send this data the next time the bad times happen.
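To compare the two snapshots, a small script along these lines could help. It is only a sketch: it assumes each zone row of 'vmstat -z' looks like "NAME: n, n, ..." and that the columns are size, limit, used, free, requests and failures, which may differ between releases. The function names and the sample numbers are hypothetical, not real measurements.

```python
# Assumed column order of `vmstat -z` rows; adjust if the header differs.
COLUMNS = ("size", "limit", "used", "free", "requests", "failures")


def parse_vmstat_z(text):
    """Parse `vmstat -z` text into {zone_name: [int, ...]}."""
    zones = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # header or blank line without a zone name
        fields = [f.strip() for f in rest.split(",") if f.strip()]
        if not fields or not all(f.isdigit() for f in fields):
            continue  # not a row of counters
        zones[name.strip()] = [int(f) for f in fields]
    return zones


def compare(good, bad, columns=COLUMNS):
    """Return {zone: {column: (good_value, bad_value)}} for changed columns."""
    changes = {}
    for zone in good.keys() & bad.keys():
        diff = {
            col: (g, b)
            for col, g, b in zip(columns, good[zone], bad[zone])
            if g != b
        }
        if diff:
            changes[zone] = diff
    return changes


# Hypothetical sample data, for illustration only.
good_text = """ITEM            SIZE  LIMIT   USED   FREE REQUESTS FAILURES
128 Bucket:     1048,     0,   200,    10,    5000,       0
mbuf_cluster:   2048, 25600,  1024,   100,   90000,       0
"""
bad_text = """ITEM            SIZE  LIMIT   USED   FREE REQUESTS FAILURES
128 Bucket:     1048,     0,   200,     0,    9000,     350
mbuf_cluster:   2048, 25600,  1024,   100,   90000,       0
"""

for zone, diff in compare(parse_vmstat_z(good_text), parse_vmstat_z(bad_text)).items():
    print(zone, diff)
```

Saving the output of 'vmstat -z' to one file during normal operation and to another during the slowdown, then feeding both to compare(), would show which zones change between the two states.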

With best regards,
Alexey Popov
_______________________________________________
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
