[ https://issues.apache.org/jira/browse/TS-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627875#comment-13627875 ]
Leif Hedstrom edited comment on TS-1405 at 4/10/13 3:06 PM:
------------------------------------------------------------
I think the max being down is an artifact of less pressure on the box (since it
can now only do about 60% of the traffic it used to). I ran a few more tests;
the second one reduces the pressure on the box, to verify that the max
response time is due to the system being on its knees:
With this patch, and 500 connections (there's no noticeable difference, other
than the mean time being about 30% worse):
{code}
6378965 fetches on 580129 conns, 498 max parallel, 6.378960E+08 bytes, in 60 seconds
100 mean bytes/fetch
106315.6 fetches/sec, 1.063156E+07 bytes/sec
msecs/connect: 0.245 mean, 8.846 max, 0.042 min
msecs/first-response: 3.791 mean, 207.045 max, 0.079 min
{code}
Current master with 300 connections, but at a lower QPS (so less pressure):
{code}
8850329 fetches on 87777 conns, 300 max parallel, 8.850330E+08 bytes, in 60 seconds
100 mean bytes/fetch
147505.5 fetches/sec, 1.475055E+07 bytes/sec
msecs/connect: 0.191 mean, 2.037 max, 0.043 min
msecs/first-response: 0.678 mean, 77.340 max, 0.085 min
{code}
So even though this second test on master is doing significantly more QPS
(almost 40% more: 147.5k vs. 106.3k fetches/sec), it still has much better
response times across the board. By reducing the throughput in this last test,
such that system resources aren't at their limits (and there's probably less
rescheduling on lock contention), the response times improve. I think that's
why, with the patch, you see slightly better response times on "Max", but it's
really not indicative of the patch improving anything: it's simply that with
the patch, ATS can't put the system under pressure.
This is pretty much the same problem I posted about earlier in this thread. As
far as I can tell, it's gotten noticeably worse since the first patch sets :).
> apply time-wheel scheduler about event system
> ----------------------------------------------
>
> Key: TS-1405
> URL: https://issues.apache.org/jira/browse/TS-1405
> Project: Traffic Server
> Issue Type: Improvement
> Components: Core
> Affects Versions: 3.2.0
> Reporter: Bin Chen
> Assignee: Bin Chen
> Fix For: 3.3.2
>
> Attachments: linux_time_wheel.patch, linux_time_wheel_v10jp.patch,
> linux_time_wheel_v11jp.patch, linux_time_wheel_v2.patch,
> linux_time_wheel_v3.patch, linux_time_wheel_v4.patch,
> linux_time_wheel_v5.patch, linux_time_wheel_v6.patch,
> linux_time_wheel_v7.patch, linux_time_wheel_v8.patch,
> linux_time_wheel_v9jp.patch
>
>
> When there are more and more events in the event system scheduler, it performs
> worse. This is the reason we use inactivecop to handle keepalive. The new
> scheduler is a time-wheel, which has better time complexity (O(1)).
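To make the quoted description concrete: a timing wheel hashes events into a fixed ring of buckets by expiry tick, so scheduling an event is O(1) and each tick only has to scan a single bucket, instead of maintaining a global ordering over all pending events. A minimal sketch of the idea (assumed names; this is not the attached patch):
{code}
#include <cstddef>
#include <cstdint>
#include <list>
#include <vector>

// Minimal hashed timing-wheel sketch (illustrative names only; not the patch).
// Events hash into (now + delay) modulo the slot count, so insertion is O(1)
// and a tick touches only one bucket.
struct TimerEvent {
  uint64_t expire_tick;            // absolute tick at which the event fires
  void (*callback)(void *);        // handler to run on expiry
  void *arg;
};

class TimingWheel {
public:
  explicit TimingWheel(std::size_t slots = 1024) : slots_(slots), now_(0) {}

  void schedule_in(TimerEvent ev, uint64_t ticks_from_now) {
    ev.expire_tick = now_ + ticks_from_now;
    slots_[ev.expire_tick % slots_.size()].push_back(ev);   // O(1) insert
  }

  // Advance one tick and fire everything that is due in this bucket.
  void tick() {
    ++now_;
    auto &bucket = slots_[now_ % slots_.size()];
    for (auto it = bucket.begin(); it != bucket.end();) {
      if (it->expire_tick <= now_) {
        it->callback(it->arg);
        it = bucket.erase(it);
      } else {
        ++it;                      // due on a later revolution of the wheel
      }
    }
  }

private:
  std::vector<std::list<TimerEvent>> slots_;
  uint64_t now_;
};
{code}
A production wheel (such as the attached patch, or the kernel's timer wheel) typically adds multiple levels or cascading to bound bucket sizes; the single-level version above is only meant to show where the O(1) comes from.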