We have received 3-4 copies of this email at different times - not sure what's going on.

Congratulations on reaching such a high scale in your testing. As you probably know by now, scaling isn't a straightforward task and involves a lot of analysis and tuning. If your CPU is 90% utilized at 10000 users, I don't see how you can expect to get more throughput from this system. In fact, you will probably find that beyond a certain point (say 6000 to 8000 users), you will need more CPU per user as scalability drops. In any case, at such high rates there can be lots of issues, and it is difficult to predict exactly what you may be hitting.
Shanti

On 09/24/09 19:25, Mingfan Lu wrote:
I am using Faban/Olio PHP to stress a 16-core machine acting as the web server, with two other DB nodes (a master-slave cluster; the master uses a high-speed SATA disk while the slave uses an SSD).

When the number of concurrent users scales through 9K, 10K, 11K, 12K, 13K, 14K, 15K and 16K, the throughput increases and then decreases:
9K     *1810.323*
10K    *1969.393*
11K    *1859.053*
12K    *1842.368*
13K    *1849.213*
14K    *1843.955*
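
As a rough way to read these numbers, the sketch below computes the throughput per 1000 users and the drop from the peak for each run. The values are copied from the list above; the names `runs` and `summarize` are made up for illustration, and the throughput unit is simply whatever Faban reported.

```python
# Throughput figures copied from the runs listed above (unit as reported by Faban).
runs = {
    9000: 1810.323,
    10000: 1969.393,
    11000: 1859.053,
    12000: 1842.368,
    13000: 1849.213,
    14000: 1843.955,
}


def summarize(runs):
    """Return (peak_users, rows); each row has per-1000-user throughput and % below peak."""
    peak_users = max(runs, key=runs.get)
    peak = runs[peak_users]
    rows = []
    for users in sorted(runs):
        tput = runs[users]
        rows.append({
            "users": users,
            "throughput": tput,
            "per_1k_users": tput / (users / 1000),
            "pct_below_peak": 100.0 * (peak - tput) / peak,
        })
    return peak_users, rows


peak_users, rows = summarize(runs)
for r in rows:
    print(f"{r['users']:>6} {r['throughput']:>10.3f} "
          f"{r['per_1k_users']:>8.2f} {r['pct_below_peak']:>6.2f}%")
```

Absolute throughput peaks at 10K users, but throughput per 1000 users already falls from 9K onward and keeps falling at every step.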

It seems that there is a bottleneck somewhere.

For details, see the attached run.xml.

My ramp-up time is 300s, the steady state is 600s, and the ramp-down is 60s.
The client startup settings:
    Time between starts (ms): 1
    Start simultaneously: No
    Start agents in parallel: No

But my profiling data shows that CPU (the peak is about 80%~90% at 10000 concurrent users; softirq% is about 14% with *4 tx and 4 rx* queues), network bandwidth (70% of 1Gb), memory usage and disk are not the bottleneck. The Apache error log is clean, with no exceptions or errors. I have also disabled static image serving (by simply disabling all *img* tags in the HTML).
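
For a back-of-the-envelope check on how much CPU headroom those figures leave, here is a minimal sketch based on the Utilization Law (U = X * D). It assumes the CPU cost per operation stays constant as load grows, which is optimistic for a system this close to saturation; the function name `cpu_bound_ceiling` is made up for illustration.

```python
def cpu_bound_ceiling(throughput, cpu_utilization):
    """Estimate throughput at 100% CPU via the Utilization Law (U = X * D).

    Assumes the CPU demand per operation (D) does not change with load.
    """
    if not 0.0 < cpu_utilization <= 1.0:
        raise ValueError("cpu_utilization must be in (0, 1]")
    return throughput / cpu_utilization


observed = 1969.393  # throughput at 10K users, from the runs above
for util in (0.80, 0.90):  # the reported peak CPU range
    ceiling = cpu_bound_ceiling(observed, util)
    headroom = 100.0 * (ceiling - observed) / observed
    print(f"CPU {util:.0%}: ceiling ~{ceiling:.0f}, headroom ~{headroom:.0f}%")
```

Even under this optimistic assumption the remaining headroom is limited, and it shrinks further if each additional user costs more CPU than the last.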

From the graphs at http://docs.google.com/present/view?id=df7282nf_30x8gwmrch&autoStart=true : with 9K concurrent users the response time is steady; with 10K there is a pulse lasting 600 sec (what happened?), and it then drops to a very small value in the last 300 sec. I want to know what causes this strange pulse when the number of concurrent users reaches 10K.
