Justin Erenkrantz wrote:
>On Fri, Sep 28, 2001 at 04:57:05PM -0700, Brian Pane wrote:
>
>>The notes following the profile offer my interpretation of
>>the numbers.
>>
>>                              % of total   cumulative % of
>>rank  function                CPU time     total CPU time
>>----  ------------------      ----------   ---------------
>> 1.   munmap                    18.16
>> 2.   _writev                   14.00
>> 3.   mmap                      11.96
>> 4.   _so_accept                 5.71
>> 5.   bndm                       4.82
>> 6.   _close                     4.76
>> 7.   _so_shutdown               4.16
>> 8.   __open                     3.18
>> 9.   _write                     2.61
>>10.   memset                     2.38
>>                                               71.74
>>
>
>Most of these look related to OS things that we can't control.
>
Yes. Or, to put it another way, the implementation efficiency is
approaching its theoretical optimum for this architecture. That's
not a bad thing. :-)

>As you said, bndm is ~4x better than its predecessor. Unless
>we were to precompute mod_include files or rip out its parsing
>algorithm with something better (which I think is possible),
>I don't think there is much we can do here.
>
Agreed, there probably isn't much more that can be done. I think
2.0's SSI parser is very close to optimal now.

>I'm surprised _so_accept, _close, and _so_shutdown are up there
>in CPU usage. I also know that gstein has gone on record in his
>advogato diary that mmap may not be the performance win we
>think it may be. It may be worth trying to compile without
>mmap and see how we perform. Also, I'm guessing this is on
>Solaris 8? Any chance we could get the sendfilev patch on
>there? (I know it is part of 7/01 and Sol9 as well.)
>
I'll see if we can get some more comparative numbers this week...

>Out of curiosity, how is the performance relative to 1.3
>for similar loads or URLs? -- justin
>
I just ran a quick comparison, using ab with concurrency==1
over the loopback on uniprocessor Linux (so these results show
speed, but not scalability).

  50KB non-SSI HTML file:       1141 requests/s with 2.0.26-dev
                                 948 requests/s with 1.3.20

  SHTML file with 3 includes:    411 requests/s with 2.0.26-dev
  (total size 50KB)              200 requests/s with 1.3.20

  0KB non-SSI HTML file:        2764 requests/s with 2.0.26-dev
                                2937 requests/s with 1.3.20

My interpretation of these results is that 2.0's basic request
handling is a bit less efficient than 1.3, as shown on the
0KB test, but its use of sendfile for non-SSI requests and
much-improved mod_include code for SSI requests gives it an
advantage in the general case.
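
For a rough sense of scale, the relative differences can be worked
out from the requests/s figures above. This is only a sketch: the
dictionary labels and the percent_change helper are illustrative
names, not anything produced by ab.

```python
# Requests/s figures copied from the ab runs above, as
# (2.0.26-dev, 1.3.20) pairs. The keys are illustrative labels.
results = {
    "50KB non-SSI HTML": (1141, 948),
    "SHTML, 3 includes, 50KB total": (411, 200),
    "0KB non-SSI HTML": (2764, 2937),
}


def percent_change(new, old):
    """Return the change of `new` relative to `old`, in percent."""
    return (new - old) / old * 100.0


for label, (v20, v13) in results.items():
    # A positive value means 2.0 served more requests/s than 1.3.
    print(f"{label}: {percent_change(v20, v13):+.1f}% for 2.0 vs 1.3")
```

That works out to roughly +20% on the 50KB static file, about +105%
on the SSI page, and about -6% on the 0KB file.
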
--Brian