On 04/06/2020 16:33, Martin Grigorov wrote:
> Hi team,
> 
> I am profiling our application and I've found that Tomcat uses its own
> Synchronized Queue/Stack implementations which synchronize on instance
> level, i.e. their methods are 'synchronized'.

~8 years ago when I was profiling Tomcat and wrote those having
something targetted to Tomcat's specific use case made a measurable
improvement.

I am more than prepared to believe that things have moved on since then
but I'd be interested in seeing the same numbers with the latest Java 8
from Adopt OpenJDK rather than Java 14.

I'd also like to see overall throughput numbers and the GC frequency for
each test scenario.

Mark



> 
> The javadoc for those two classes say that they are preferred because
> they are GC-friendlier than
> java.uti.concurrent.ConcurrentLinkedQueue.
> 
> I use Async Profiler
> (https://github.com/jvm-profiling-tools/async-profiler) to profile CPU,
> memory,
> locks and few lower level types (from perf).
> Issuing "./profiler.sh -e lock -d 60 PID" against 9.0.x shows such reports:
> 
> Started [lock] profiling                                               
>                                                                      
>  --- Execution profile ---                                             
>                                                                      
>  Total samples       : 713834                                           
>                                                                        
>                                                                        
>                                                                        
>   Frame buffer usage  : 0.0288%                                         
>                                                                        
>                                                                        
>                                                                     ---
> 1106165618177 ns (58.97%), 395609 samples                              
>                                                                        
>                                                              [ 0]
> org.apache.tomcat.util.collections.SynchronizedStack                    
>                                                                        
>                                                          [ 1]
> org.apache.tomcat.util.collections.SynchronizedStack.push              
>                                                                        
>                                                           [ 2]
> org.apache.tomcat.util.net.NioBlockingSelector.write                    
>                                                                        
>                                                          [ 3]
> org.apache.tomcat.util.net.NioSelectorPool.write                        
>                                                                        
>                                                          [ 4]
> org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper.doWrite        
>                                                                        
>                                                           [ 5]
> org.apache.tomcat.util.net.SocketWrapperBase.doWrite                    
>                                                                        
>                                                          [ 6]
> org.apache.tomcat.util.net.SocketWrapperBase.flushBlocking              
>                                                                        
>                                                          [ 7]
> org.apache.tomcat.util.net.SocketWrapperBase.flush                      
>                                                                        
>                                                          [ 8]
> org.apache.coyote.http11.Http11OutputBuffer$SocketOutputBuffer.end      
>   [ 9] org.apache.coyote.http11.filters.IdentityOutputFilter.end        
>                                                    
>   [10] org.apache.coyote.http11.Http11OutputBuffer.end              
>   [11] org.apache.coyote.http11.Http11Processor.finishResponse      
>   [12] org.apache.coyote.AbstractProcessor.action
>   [13] org.apache.coyote.Response.action
>   [14] org.apache.catalina.connector.OutputBuffer.close
>   [15] org.apache.catalina.connector.Response.finishResponse
>   [16] org.apache.catalina.connector.CoyoteAdapter.service
>   [17] org.apache.coyote.http11.Http11Processor.service
>   [18] org.apache.coyote.AbstractProcessorLight.process
>   [19] org.apache.coyote.AbstractProtocol$ConnectionHandler.process
>   [20] org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun
>   [21] org.apache.tomcat.util.net.SocketProcessorBase.run
>   [22] java.util.concurrent.ThreadPoolExecutor.runWorker
>   [23] java.util.concurrent.ThreadPoolExecutor$Worker.run
>   [24] org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run
>   [25] java.lang.Thread.run
> 
> --- 769519847169 ns (41.03%), 314384 samples
>   [ 0] org.apache.tomcat.util.collections.SynchronizedStack
>   [ 1] org.apache.tomcat.util.collections.SynchronizedStack.pop
>   [ 2] org.apache.tomcat.util.net.NioBlockingSelector.write
>   [ 3] org.apache.tomcat.util.net.NioSelectorPool.write
>   [ 4] org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper.doWrite
>   [ 5] org.apache.tomcat.util.net.SocketWrapperBase.doWrite
>   [ 6] org.apache.tomcat.util.net.SocketWrapperBase.flushBlocking
>   [ 7] org.apache.tomcat.util.net.SocketWrapperBase.flush
>   [ 8] org.apache.coyote.http11.Http11OutputBuffer$SocketOutputBuffer.end
>   [ 9] org.apache.coyote.http11.filters.IdentityOutputFilter.end
>   [10] org.apache.coyote.http11.Http11OutputBuffer.end
>   [11] org.apache.coyote.http11.Http11Processor.finishResponse 
>    ....
> 
> 
> I've made the following changes
> - 
> https://github.com/apache/tomcat/compare/9.0.x...martin-g:feature/lock-free-queue-and-stack
> and ran the test suite - all green!
> 
> Now the output from the locks profiling looks like:
> 
> 
> --- 72354435197 ns (47.98%), 4008 samples
>   [ 0] java.util.concurrent.locks.ReentrantLock$NonfairSync
>   [ 1] jdk.internal.misc.Unsafe.park
>   [ 2] java.util.concurrent.locks.LockSupport.park
>   [ 3] java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire
>   [ 4]
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly
>   [ 5] java.util.concurrent.locks.ReentrantLock$Sync.lockInterruptibly
>   [ 6] java.util.concurrent.locks.ReentrantLock.lockInterruptibly
>   [ 7] java.util.concurrent.LinkedBlockingQueue.poll
>   [ 8] org.apache.tomcat.util.threads.TaskQueue.poll
>   [ 9] org.apache.tomcat.util.threads.TaskQueue.poll
>   [10] java.util.concurrent.ThreadPoolExecutor.getTask
>   [11] java.util.concurrent.ThreadPoolExecutor.runWorker
>   [12] java.util.concurrent.ThreadPoolExecutor$Worker.run
>   [13] org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run
>   [14] java.lang.Thread.run
> 
> --- 58413748017 ns (38.74%), 10179 samples
>   [ 0] java.lang.Object
>   [ 1] org.apache.catalina.core.StandardContext$ContextFilterMaps.asArray
>   [ 2] org.apache.catalina.core.StandardContext.findFilterMaps
>   [ 3] org.apache.catalina.core.ApplicationFilterFactory.createFilterChain
>   [ 4] org.apache.catalina.core.StandardWrapperValve.invoke
>   [ 5] org.apache.catalina.core.StandardContextValve.invoke
>   [ 6] org.apache.catalina.authenticator.AuthenticatorBase.invoke
>   [ 7] org.apache.catalina.core.StandardHostValve.invoke
>   [ 8] org.apache.catalina.valves.ErrorReportValve.invoke
>   [ 9] org.apache.catalina.core.StandardEngineValve.invoke
>   [10] org.apache.catalina.connector.CoyoteAdapter.service
>   [11] org.apache.coyote.http11.Http11Processor.service
>   [12] org.apache.coyote.AbstractProcessorLight.process
>   [13] org.apache.coyote.AbstractProtocol$ConnectionHandler.process
>   [14] org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun
>   [15] org.apache.tomcat.util.net.SocketProcessorBase.run
>   [16] java.util.concurrent.ThreadPoolExecutor.runWorker
>   [17] java.util.concurrent.ThreadPoolExecutor$Worker.run
>   [18] org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run
>   [19] java.lang.Thread.run
> 
> --- 5856040617 ns (3.88%), 11395 samples
>   [ 0] org.apache.tomcat.util.collections.SynchronizedQueue
>   [ 1] org.apache.tomcat.util.collections.SynchronizedQueue.offer
>   [ 2] org.apache.tomcat.util.net.NioEndpoint$Poller.addEvent
>   [ 3] org.apache.tomcat.util.net.NioEndpoint$Poller.add
>   [ 4]
> org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper.registerReadInterest
>   [ 5] org.apache.coyote.AbstractProtocol$ConnectionHandler.process
>   ...
> 
> 
> I monitor the GC both with -Xloggc:gc.log and with JMX and the patterns
> are the same, no new spikes. 
> JDK 14 with G1GC.
> 
> I load the application with wrk (https://github.com/wg/wrk/) and it
> makes 70/80K per second.
> 
> I guess there is a reason to use these custom synchronized data
> structures but so far I don't see any drawbacks after moving to the
> lock-free impls.
> 
> Regards,
> Martin
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org

Reply via email to