Re: Tag reuse performance [Was: Re: DO NOT REPLY [Bug 16001] - Tag.release() not invoked]

2003-01-21 Thread Remy Maucherat
Costin Manolache wrote:

Hans Bergsten wrote:



Costin Manolache wrote:


[...]
In an ideal world, all core tags would be recyclable and garbage-free -
that may allow them to run at comparable speed with a hard-coded page.


I think it's more important to implement open coding of JSTL, i.e.
generate if and for statement instead of using c:if and c:forEach
tag handlers. That would really make a difference for all apps that
use JSTL, while the potential gains from tag handler reuse depend on
a lot of factors that vary between applications and the runtime
environment.



+1 - open coding is far better ( for performance ). Is the API/model
for portable open coding defined and stable ? That would be by far the 
biggest improvement in JSP performance.

Kin-Man told me it wasn't done yet (he mentioned the current methods 
should stay, though), unfortunately. From what he said, he needed to 
write more tag plugins to see if he was able to implement JSTL tags with 
the current API.
I agree it's definitely a good way to get around the problems with tags 
without adding too much complexity. In the end, I decided to talk about 
it a bit in a chapter of my book.
What's really funny is that you can get rid of the tag handler, and 
write only the tag plugin. That is, of course, if you don't care about 
portability, and your tag plugin is able to handle all forms of your 
tag.

But for people who use regular tag handlers - I think we need to fix
the tag pool, and a fixed tag pool will improve the performance 
significantly. And if regular tags are used, for whatever reason, 
recycling should be taken into account - if people care a bit
about performance.

+1.

Remy


--
To unsubscribe, e-mail:   mailto:[EMAIL PROTECTED]
For additional commands, e-mail: mailto:[EMAIL PROTECTED]




Re: Tag reuse performance [Was: Re: DO NOT REPLY [Bug 16001] - Tag.release() not invoked]

2003-01-20 Thread Peter Lin

 
I haven't read all the posts in this discussion, but here are some facts from personal 
observations.
For pages with only a few tags, i.e. fewer than 30, tag pooling doesn't help. On the 
other hand, if your page has 100+ tags, it improves performance. Some of the pages I 
benchmarked had about 135 tags. In those situations, I saw a 20-50% improvement. 
I would argue that sites that don't have a lot of load should simply turn off tag 
pooling. Sites that use tags extensively and get 1 million page views a day will gain 
significantly from tag pooling.
 
peter lin
 


Re: Tag reuse performance [Was: Re: DO NOT REPLY [Bug 16001] - Tag.release() not invoked]

2003-01-20 Thread Costin Manolache
Peter Lin wrote:

 
  
 I haven't read all the posts in this discussion, but here are some facts
 from personal observations.
 For pages with only a few tags, i.e. fewer than 30, tag pooling doesn't help.
 On the other hand, if your page has 100+ tags, it improves performance.
 Some of the pages I benchmarked had about 135 tags. In those
 situations, I saw a 20-50% improvement. I would argue that sites that
 don't have a lot of load should simply turn off tag pooling. Sites that
 use tags extensively and get 1 million page views a day will gain
 significantly from tag pooling.


Is this based on the current tag pool implementation in jasper2 ?
Because it is pretty clear that the tag pool has a few problems. 

I would say the nature of the tags will also have a big impact. If your
tag is very simple - you'll probably get some small benefits under load
( 20..30% ?). If the tag uses internal data structures, buffers, etc - 
it's very likely you'll see more ( since creating each tag instance will
also create the additional hashtable, StringBuffers, etc ).

I would bet that with complex tags that are specifically written to take 
advantage of the recycling you would see at least 2x better performance ( 
with a good sync-free and large enough tag pool ). If your tag is using 
any buffers or complex/expensive data structures that can be recycled - 
you'll save a lot. 

I don't think the number of tags in a page is too important - even if you
have 1 complex tag - with 100 concurrent users - you should see a difference.

In an ideal world, all core tags would be recyclable and garbage-free - 
that may allow them to run at comparable speed with a hard-coded page.


Costin


  

Re: Tag reuse performance [Was: Re: DO NOT REPLY [Bug 16001] - Tag.release() not invoked]

2003-01-20 Thread Peter Lin

 
These were all JSTL tags. Back when I ran the tests, I posted some of the results. Some 
of the tests were synthetic, i.e. 100 JSTL c:out tags in one page. Others were 
based on an actual page layout with lots of markup logic that uses JSTL c:choose in 
conjunction with JSTL XML tags.
 
The tests were with Tomcat 4.1's Jasper 2 and with 4.0.x Jasper 1. Obviously, tag 
pooling was only available with Jasper 2. I didn't have time to test Tomcat 3.x tag pooling.
 
peter lin
 



Re: Tag reuse performance [Was: Re: DO NOT REPLY [Bug 16001] - Tag.release() not invoked]

2003-01-20 Thread Hans Bergsten
Costin Manolache wrote:

[...]
In an ideal world, all core tags would be recyclable and garbage-free - 
that may allow them to run at comparable speed with a hard-coded page.

I think it's more important to implement open coding of JSTL, i.e.
generate if and for statement instead of using c:if and c:forEach
tag handlers. That would really make a difference for all apps that
use JSTL, while the potential gains from tag handler reuse depend on
a lot of factors that vary between applications and the runtime
environment.
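
To make the idea of open coding concrete, here is a minimal sketch in modern
Java. It is only an illustration: ForEachHandler and its methods are a
simplified, hypothetical stand-in for a loop tag handler, not the JSTL or
Jasper classes, and the 'x' written by the body is a placeholder. The first
method drives a handler object the way generated page code would; the second
is the plain loop a compiler could emit instead.

```java
/** Hypothetical illustration of open coding; not Jasper's generated code. */
final class OpenCodingSketch {

    /** Simplified stand-in for a c:forEach handler (begin/end only). */
    static final class ForEachHandler {
        private int current;
        private int end;

        void setBegin(int begin) { this.current = begin; }
        void setEnd(int end) { this.end = end; }

        /** True if the body should be evaluated at least once. */
        boolean doStartTag() { return current <= end; }

        /** Advances the loop; true if the body should run again. */
        boolean doAfterBody() { return ++current <= end; }
    }

    /** Handler-driven style: the page code calls into a handler object. */
    static String withHandler(int begin, int end) {
        StringBuilder out = new StringBuilder();
        ForEachHandler loop = new ForEachHandler();
        loop.setBegin(begin);
        loop.setEnd(end);
        if (loop.doStartTag()) {
            do {
                out.append('x'); // placeholder body output
            } while (loop.doAfterBody());
        }
        return out.toString();
    }

    /** Open-coded style: the compiler emits the equivalent plain Java loop. */
    static String openCoded(int begin, int end) {
        StringBuilder out = new StringBuilder();
        for (int i = begin; i <= end; i++) {
            out.append('x'); // same placeholder body output
        }
        return out.toString();
    }
}
```

Both methods produce the same output for the same begin/end values; the
open-coded version simply has no handler object to create, configure, or
recycle.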

Hans
--
Hans Bergsten      [EMAIL PROTECTED]
Gefion Software    http://www.gefionsoftware.com/
Author of O'Reilly's JavaServer Pages, covering JSP 1.2 and JSTL 1.0
Details at         http://TheJSPBook.com/






Re: Tag reuse performance [Was: Re: DO NOT REPLY [Bug 16001] - Tag.release() not invoked]

2003-01-20 Thread Costin Manolache
Hans Bergsten wrote:

 Costin Manolache wrote:
 [...]
 In an ideal world, all core tags would be recyclable and garbage-free -
 that may allow them to run at comparable speed with a hard-coded page.
 
 I think it's more important to implement open coding of JSTL, i.e.
 generate if and for statement instead of using c:if and c:forEach
 tag handlers. That would really make a difference for all apps that
 use JSTL, while the potential gains from tag handler reuse depend on
 a lot of factors that vary between applications and the runtime
 environment.

+1 - open coding is far better ( for performance ). Is the API/model
for portable open coding defined and stable ? That would be by far the 
biggest improvement in JSP performance.

But for people who use regular tag handlers - I think we need to fix
the tag pool, and a fixed tag pool will improve the performance 
significantly. And if regular tags are used, for whatever reason, 
recycling should be taken into account - if people care a bit
about performance.

Costin






Re: Tag reuse performance [Was: Re: DO NOT REPLY [Bug 16001] - Tag.release() not invoked]

2003-01-19 Thread Costin Manolache
Hans Bergsten wrote:

                 Without pooling   With pooling   Reuse w/o overhead
 ---------------------------------------------------------------------
 5 threads
   Avg.:                 330 ms         349 ms            N/A
   Rate:               15.2/sec       13.6/sec            N/A
 
 20 threads
   Avg.:               1,752 ms       1,446 ms         1,265 ms
   Rate:               12.1/sec       13.6/sec         14.7/sec
 
 To me, this indicates that if you can avoid _all_ reuse overhead,
 there's some performance to be gained from reuse but not much. With the

From 1.2s to 1.7s there is about 35% difference. I would call this 
quite significant. Even between 1.4 and 1.7 - you have 20%. Try to
increase the thread count to 100 - and you'll see this going up.

The difference ( 0.5s ) is probably 2-3 times the response time of
apache for a static page. And most users will feel it.
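
The percentages quoted here depend on which figure is taken as the baseline.
A small sketch of the arithmetic, using the 20-thread averages from the table
(the class and method names are only illustrative):

```java
/** Illustrative helper for comparing two average response times. */
final class RelativeDifference {

    /** How much longer `slower` is, as a percentage of `faster`. */
    static double percentSlower(double slower, double faster) {
        return (slower - faster) / faster * 100.0;
    }

    /** How much time is saved going to `faster`, as a percentage of `slower`. */
    static double percentSaved(double slower, double faster) {
        return (slower - faster) / slower * 100.0;
    }
}
```

With the measured values, percentSlower(1752, 1265) is about 38.5 and
percentSaved(1752, 1265) is about 27.8; for the current pool,
percentSlower(1752, 1446) is about 21.2 and percentSaved(1752, 1446) is about
17.5. The "about 35%" and "20%" figures fall between the two conventions.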

 current implementation, however, the overhead seems to kill all gains
 from creating fewer instances. I doubt increasing MAX_POOL_SIZE makes
 much of a difference.

Increasing it from the current 5 - it would make a difference. 
I agree - the ideal no overhead is harder to achieve, but I think the 
thread-local, no-sync case is close enough. 
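
A thread-local, no-sync pool could look roughly like the sketch below. This
is an assumption about the general shape of such a pool, written in modern
Java for brevity; ThreadLocalPool, get(), reuse() and maxPerThread are
hypothetical names and this is not Jasper's tag handler pool. Each thread
keeps its own stack of free instances, so neither operation takes a lock.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

/** Hypothetical per-thread object pool; not Jasper's actual tag pool. */
final class ThreadLocalPool<T> {
    private final Supplier<T> factory;
    private final int maxPerThread;

    // Each thread sees its own stack, so get/reuse need no synchronization.
    private final ThreadLocal<ArrayDeque<T>> free =
            ThreadLocal.withInitial(ArrayDeque::new);

    ThreadLocalPool(Supplier<T> factory, int maxPerThread) {
        this.factory = factory;
        this.maxPerThread = maxPerThread;
    }

    /** Returns a recycled instance if this thread has one, else a new one. */
    T get() {
        T pooled = free.get().pollFirst();
        return pooled != null ? pooled : factory.get();
    }

    /** Hands an instance back; extras beyond the cap are left to the GC. */
    void reuse(T instance) {
        ArrayDeque<T> stack = free.get();
        if (stack.size() < maxPerThread) {
            stack.addFirst(instance);
        }
    }
}
```

The trade-off is memory: with N request threads, up to N times maxPerThread
instances may be retained per tag, in exchange for lock-free access.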

I'll try to reproduce the test. BTW, how many requests did you make, and
what was the max response time ( max is very affected by GC ) ? I usually do 
5000 to warm up and 10.000 to run the test.

This is a very good start, thanks for bringing this up. 

Costin






Re: Tag reuse performance [Was: Re: DO NOT REPLY [Bug 16001] - Tag.release() not invoked]

2003-01-19 Thread Glenn Nielsen
Interesting.  Your test JSP page looks like a valid test.

There is no data about GC in your tests, of course GC can happen at any time.

I would be interested in seeing the tests run with -Xincgc and -Xverbose:gc.
Then run a high enough volume of tests that a Full GC gets triggered a dozen
times or so.  I would think the GC data would be very different between the
5 thread and 20 thread tests with tag pooling enabled.  The metrics to use
might be the time spent doing GC, the number of incremental GC's, and the
number of Full GC's.
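
As a complement to reading verbose GC logs, the collection count and
accumulated GC time can also be read from inside the JVM. The sketch below
uses the java.lang.management API, which was added in Java 5 and so was not
available on the JDK 1.4.1 used in these tests; GcSnapshot is a hypothetical
helper name, and the code would have to run in the same JVM as the container
to be meaningful.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: totals GC activity across all collectors in this JVM. */
final class GcSnapshot {
    final long collections;
    final long millis;

    private GcSnapshot(long collections, long millis) {
        this.collections = collections;
        this.millis = millis;
    }

    /** Sums collection counts and accumulated collection time so far. */
    static GcSnapshot take() {
        long count = 0;
        long time = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when a collector does not report the value.
            count += Math.max(0L, gc.getCollectionCount());
            time += Math.max(0L, gc.getCollectionTime());
        }
        return new GcSnapshot(count, time);
    }

    /** GC activity between an earlier snapshot and this one. */
    GcSnapshot since(GcSnapshot earlier) {
        return new GcSnapshot(collections - earlier.collections, millis - earlier.millis);
    }
}
```

Taking one snapshot before a test run and one after, then calling since(),
gives the number of collections and the time spent in GC for that run.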

There is also no data about system CPU load. The tests show performance
from a request latency and requests per second viewpoint, but do not
necessarily show the difference in scaling.  Showing CPU load might
indicate whether one solution scales better than another.

Regards,

Glenn

Hans Bergsten wrote:

Hans Bergsten wrote:


Costin Manolache wrote:


[...]
Wow. I would be _very_ curious to see those benchmarks and the modern
JVM that was used.
All my tests ( including JDK1.4, IBM VMs, GCJ ) show that reusing is well
worth the trouble - at least if you have 100s of requests per second
( it is not worth the trouble for very low loads ). But I'm happy to
hear that I'm wrong.


I'll try to find the figures we looked at and post them, or run a new
benchmark against TC 4.1 with and without tag handler pooling enabled.
But it may take some time, because right now I'm busy with other stuff.
If you disagree with the decision, you may want to send your feedback
to the EG: [EMAIL PROTECTED]. JSP 2.0 is still just PFD.



Okay, I ran a few test cases with Tomcat 4.1.18. Benchmarks are of
course never perfect, but it should be good enough to evaluate the
difference with and without tag handler reuse.

My test server is a 1 GHz Pentium with 256 MB, with Sun's Linux
JDK 1.4.1. I ran all tests with Apache JMeter, once with 5 threads (so
the MAX_POOL_SIZE is not exceeded) and one time with 20 threads, with
and without pooling enabled.

I also hacked the servlet class generated with pooling enabled so that
there's no overhead from the reuse itself. I simply create one instance
of each tag handler at the beginning of the _jspService() method and
use this instance for all invocations. This is as efficient as it can
be, with no extra cost for synchronization or Map lookups. I ran this
test once with 20 threads.

I used this test page:

  <%@ page contentType="text/plain" %>
  <%@ taglib prefix="c" uri="http://java.sun.com/jstl/core" %>

  <c:forEach begin="1" end="100">
    <c:forEach begin="1" end="10">
      <c:out value="" />
    </c:forEach>
  </c:forEach>

While it's simple, it should show the impact of tag handler reuse.
With pooling disabled, one tag handler instance is created for the
outer c:forEach, a new one is created for the inner c:forEach
for each pass through the loop (i.e. 100 instances), and a new
instance for c:out is created for each invocation (i.e. 1000
instances). With pooling enabled, the total number of instances
depends on the number of concurrent requests. For the 5 threads tests,
it should stay close to 5 instances (although non-pooled instances
may occasionally be created and released immediately). For the 20
threads test, a lot more instances are created (since the pool is
currently limited to 5 instances), but it should still be less than
when pooling is disabled.
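
The instance counts described above can be checked with a small counting
sketch. Handler is a hypothetical stand-in whose only job is to count
constructions; no real JSTL or Jasper classes are involved. The first method
mirrors the pooling-disabled case, the second mirrors the hand-edited servlet
that creates one handler per tag at the start of the request.

```java
/** Hypothetical allocation counter for the nested test page. */
final class HandlerCountSketch {
    static int created;

    /** Stand-in for any tag handler; construction is what gets counted. */
    static final class Handler {
        Handler() { created++; }
    }

    /** Pooling disabled: a fresh handler at every tag invocation. */
    static int withoutPooling() {
        created = 0;
        new Handler();                      // outer c:forEach
        for (int i = 1; i <= 100; i++) {
            new Handler();                  // inner c:forEach, once per outer pass
            for (int j = 1; j <= 10; j++) {
                new Handler();              // c:out, once per inner pass
            }
        }
        return created;
    }

    /** Hand-edited variant: one handler per tag, created up front and reused. */
    static int hoistedPerRequest() {
        created = 0;
        Handler outer = new Handler();
        Handler inner = new Handler();
        Handler out = new Handler();
        for (int i = 1; i <= 100; i++) {
            for (int j = 1; j <= 10; j++) {
                // outer, inner and out would be reconfigured and invoked here
            }
        }
        return created;
    }
}
```

Per request, withoutPooling() counts 1 + 100 + 1000 = 1101 handler
constructions, while hoistedPerRequest() counts 3.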

Okay, here are the results

                 Without pooling   With pooling   Reuse w/o overhead
---------------------------------------------------------------------
5 threads
  Avg.:                 330 ms         349 ms            N/A
  Rate:               15.2/sec       13.6/sec            N/A

20 threads
  Avg.:               1,752 ms       1,446 ms         1,265 ms
  Rate:               12.1/sec       13.6/sec         14.7/sec

To me, this indicates that if you can avoid _all_ reuse overhead,
there's some performance to be gained from reuse but not much. With the
current implementation, however, the overhead seems to kill all gains
from creating fewer instances. I doubt increasing MAX_POOL_SIZE makes
much of a difference.

Feel free to run the test on your platform. It could be interesting
to see some more results. Also, if you think my test page is flawed,
I'd appreciate ideas for how to improve it.

Hans









Re: Tag reuse performance [Was: Re: DO NOT REPLY [Bug 16001] - Tag.release() not invoked]

2003-01-19 Thread Hans Bergsten
Glenn Nielsen wrote:

Interesting.  Your test JSP page looks like a valid test.


Good.


There is no data about GC in your tests, of course GC can happen at any time.

I would be interested in seeing the tests run with -Xincgc and -Xverbose:gc.
Then run a high enough volume of tests that a Full GC gets triggered a dozen
times or so.  I would think the GC data would be very different between the
5 thread and 20 thread tests with tag pooling enabled.  The metrics to use
might be the time spent doing GC, the number of incremental GC's, and the
number of Full GC's.

I can rerun the tests with these options and include a sample of the
verbose:gc output. Does that help? I'm afraid I don't have the time to
summarize the GC data as you suggest, but you're welcome to do so ;-)


There is also no data about system CPU load. The tests show performance
from a request latency and requests per second viewpoint, but do not
necessarily show the difference in scaling.  Showing CPU load might
indicate whether one solution scales better than another.


Any tips about the best way to measure CPU in a meaningful way on Linux?
I know about top/gtop, vmstat, uptime etc. but they show system values
(or snapshots of the current CPU for a thread). I vaguely recall using
a command on Solaris many years ago that takes the command you want to
measure as an argument and gives you min, max and average CPU when it
completes, but I've forgotten what it's called. Does it ring any bells?
If not, what do you recommend to measure CPU?

Hans









Re: Tag reuse performance [Was: Re: DO NOT REPLY [Bug 16001] - Tag.release() not invoked]

2003-01-19 Thread Hans Bergsten
Costin Manolache wrote:

Hans Bergsten wrote:



                 Without pooling   With pooling   Reuse w/o overhead
---------------------------------------------------------------------
5 threads
  Avg.:                 330 ms         349 ms            N/A
  Rate:               15.2/sec       13.6/sec            N/A

20 threads
  Avg.:               1,752 ms       1,446 ms         1,265 ms
  Rate:               12.1/sec       13.6/sec         14.7/sec

To me, this indicates that if you can avoid _all_ reuse overhead,
there's some performance to be gained from reuse but not much. With the




From 1.2s to 1.7s there is about 35% difference. I would call this 
quite significant. Even between 1.4 and 1.7 - you have 20%. Try to
increase the thread count to 100 - and you'll see this going up.

The difference ( 0.5s ) is probably 2-3 times the response time of
apache for a static page. And most users will feel it.


I agree that in percentage, the difference is somewhat significant,
but don't make too much out of the real value. My test server is not
representative of the type of hardware you would use for a site with
this type of load. On hardware suitable for the task, the difference in
the real values will likely be a lot smaller, and IMHO, insignificant.
But please, let's not start a long debate about what's significant or
not (that depends on too many factors). All I'm trying to show with
these simple tests is that for pooling to really make a difference at
all, you need to avoid all overhead, which may be very hard, and that
the overhead with current pooling seems to eat all potential gain.


current implementation, however, the overhead seems to kill all gains
from creating fewer instances. I doubt increasing MAX_POOL_SIZE makes
much of a difference.



Increasing it from the current 5 - it would make a difference. 
I agree - the ideal no overhead is harder to achieve, but I think the 
thread-local, no-sync case is close enough. 

I'll try to reproduce the test. BTW, how many requests did you make, and
what was the max response time ( max is very affected by GC ) ? I usually do 
5000 to warm up and 10.000 to run the test.

I ran 10,000 requests for each test case after a manual warm up (just a
few requests to give the JIT a chance to kick in). If I rerun the tests
to capture GC data (as Glenn was asking for), I can run a longer warm-up
as well. I didn't record the max values, but IIRC they were around 100
sec in all cases.


This is a very good start, thanks for bringing this up. 

I hope it at least gives us a better idea about what types of gains
we can realistically expect from tag handler reuse.

Hans
--
Hans Bergsten      [EMAIL PROTECTED]
Gefion Software    http://www.gefionsoftware.com/
Author of O'Reilly's JavaServer Pages, covering JSP 1.2 and JSTL 1.0
Details at         http://TheJSPBook.com/






Re: Tag reuse performance [Was: Re: DO NOT REPLY [Bug 16001] - Tag.release() not invoked]

2003-01-19 Thread Remy Maucherat
Hans Bergsten wrote:

Costin Manolache wrote:


quite significant. Even between 1.4 and 1.7 - you have 20%. Try to
increase the thread count to 100 - and you'll see this going up.

The difference ( 0.5s ) is probably 2-3 times the response time of
apache for a static page. And most users will feel it.



I agree that in percentage, the difference is somewhat significant,
but don't make too much out of the real value. My test server is not
representative of the type of hardware you would use for a site with
this type of load. On hardware suitable for the task, the difference in
the real values will likely be a lot smaller, and IMHO, insignificant.
But please, let's not start a long debate about what's significant or
not (that depends on too many factors). All I'm trying to show with
these simple tests is that for pooling to really make a difference at
all, you need to avoid all overhead, which may be very hard, and that
the overhead with current pooling seems to eat all potential gain.


Personally, I like to think in terms of percentages, as it gives 
something to measure improvement, and anything which will give 10% of 
free performance is good. (Of course, if you don't need the extra 
performance, then good for you)

There have been a lot of changes which individually gave only a small 
performance gain when going from 4.0 to 4.1, but put everything together 
and it ends up being much faster. I also think it's great there's no 
longer any huge hotspot left in Tomcat, which would represent that much 
in (wasted) processing time. 5.0 will be no different, and will feature 
a lot of incremental performance improvements, which hopefully will be 
significant once all put together.

Remy





Re: Tag reuse performance [Was: Re: DO NOT REPLY [Bug 16001] - Tag.release() not invoked]

2003-01-19 Thread Costin Manolache
Hans Bergsten wrote:

                 Without pooling   With pooling   Reuse w/o overhead
---------------------------------------------------------------------
5 threads
  Avg.:                 330 ms         349 ms            N/A
  Rate:               15.2/sec       13.6/sec            N/A

20 threads
  Avg.:               1,752 ms       1,446 ms         1,265 ms
  Rate:               12.1/sec       13.6/sec         14.7/sec

To me, this indicates that if you can avoid _all_ reuse overhead,
there's some performance to be gained from reuse but not much. With the
 
 
From 1.2s to 1.7s there is about 35% difference. I would call this
 quite significant. Even between 1.4 and 1.7 - you have 20%. Try to
 increase the thread count to 100 - and you'll see this going up.
 
 The difference ( 0.5s ) is probably 2-3 times the response time of
 apache for a static page. And most users will feel it.
 
 I agree that in percentage, the difference is somewhat significant,
 but don't make too much out of the real value. My test server is not
 representative of the type of hardware you would use for a site with
 this type of load. On hardware suitable for the task, the difference in

And the test page is not representative of the type of pages that will
run on a real site.  I know that.

All we can measure with relative accuracy is the overhead of the 
container/jsp implementation - at least in relative terms. 
Take as the reference the time ( or RPS ) for Apache to serve the same
output as a static page. Or the time a servlet will take to generate
the same output. Run your tests with 5, 20, 100 RPS ( and ab may be
a better driver ). Compare the results - and most likely a production
server will see similar ratios.

I'll try to find some time ( next week - I hope ) and run the same 
tests with the no sync pool.


 the real values will likely be a lot smaller, and IMHO, insignificant.
 But please, let's not start a long debate about what's significant or
 not (that depends on too many factors). All I'm trying to show with
 these simple tests is that for pooling to really make a difference at
 all, you need to avoid all overhead, which may be very hard, and that
 the overhead with current pooling seems to eat all potential gain.

Well - it shows pretty clearly that the _current_ implementation
of the tag pool is broken. Even if we don't take sync into account, the pool
has a 5 object limit - what else could you expect ??


 I ran 10,000 requests for each test case after a manual warm up (just a
 few requests to give the JIT a chance to kick in). If I rerun the tests
 to capture GC data (as Glenn was asking for), I can run a longer warm-up
 as well. I didn't record the max values, but IIRC they were around 100
 sec in all cases.

The 1.4 JIT takes some time to kick in; if you run batches of 1000 requests
you'll see the time keeps improving. I would do at least 5000 requests to
warm up the JIT.
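
The warm-up advice can be sketched as a tiny harness that runs a batch of
untimed calls before measuring. This is only an illustration of the
principle in modern Java, not the JMeter or ab setup used in this thread;
WarmupBenchmark and averageNanos are hypothetical names, and the Runnable
stands in for whatever issues one request.

```java
/** Hypothetical micro-harness: discard a warm-up batch, then time the rest. */
final class WarmupBenchmark {

    /**
     * Runs `warmup` untimed calls so the JIT can compile the hot paths,
     * then returns the mean time in nanoseconds over `measured` calls.
     */
    static double averageNanos(Runnable request, int warmup, int measured) {
        if (measured <= 0) {
            throw new IllegalArgumentException("measured must be positive");
        }
        for (int i = 0; i < warmup; i++) {
            request.run(); // results intentionally discarded
        }
        long start = System.nanoTime();
        for (int i = 0; i < measured; i++) {
            request.run();
        }
        return (System.nanoTime() - start) / (double) measured;
    }
}
```

With the numbers suggested in this thread, a call would look like
averageNanos(request, 5000, 10000).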

 This is a very good start, thanks for bringing this up.
 
 I hope it at least gives us a better idea about what types of gains
 we can realistically expect from tag handler reuse.

Most of the improvements in coyote ( or in 3.3 over 3.2 ) are due to 
object reuse. It is possible that tag handlers are different and 
the other overheads will obscure any benefit ( at least under low load ),
but I can bet that under heavy load recycling will be very significant, if 
done correctly. 

Costin



