On 04/06/2020 16:33, Martin Grigorov wrote: > Hi team, > > I am profiling our application and I've found that Tomcat uses its own > Synchronized Queue/Stack implementations which synchronize on instance > level, i.e. their methods are 'synchronized'.
~8 years ago when I was profiling Tomcat and wrote those having something targetted to Tomcat's specific use case made a measurable improvement. I am more than prepared to believe that things have moved on since then but I'd be interested in seeing the same numbers with the latest Java 8 from Adopt OpenJDK rather than Java 14. I'd also like to see overall throughput numbers and the GC frequency for each test scenario. Mark > > The javadoc for those two classes say that they are preferred because > they are GC-friendlier than > java.uti.concurrent.ConcurrentLinkedQueue. > > I use Async Profiler > (https://github.com/jvm-profiling-tools/async-profiler) to profile CPU, > memory, > locks and few lower level types (from perf). > Issuing "./profiler.sh -e lock -d 60 PID" against 9.0.x shows such reports: > > Started [lock] profiling > > --- Execution profile --- > > Total samples : 713834 > > > > Frame buffer usage : 0.0288% > > > --- > 1106165618177 ns (58.97%), 395609 samples > > [ 0] > org.apache.tomcat.util.collections.SynchronizedStack > > [ 1] > org.apache.tomcat.util.collections.SynchronizedStack.push > > [ 2] > org.apache.tomcat.util.net.NioBlockingSelector.write > > [ 3] > org.apache.tomcat.util.net.NioSelectorPool.write > > [ 4] > org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper.doWrite > > [ 5] > org.apache.tomcat.util.net.SocketWrapperBase.doWrite > > [ 6] > org.apache.tomcat.util.net.SocketWrapperBase.flushBlocking > > [ 7] > org.apache.tomcat.util.net.SocketWrapperBase.flush > > [ 8] > org.apache.coyote.http11.Http11OutputBuffer$SocketOutputBuffer.end > [ 9] org.apache.coyote.http11.filters.IdentityOutputFilter.end > > [10] org.apache.coyote.http11.Http11OutputBuffer.end > [11] org.apache.coyote.http11.Http11Processor.finishResponse > [12] org.apache.coyote.AbstractProcessor.action > [13] org.apache.coyote.Response.action > [14] org.apache.catalina.connector.OutputBuffer.close > [15] org.apache.catalina.connector.Response.finishResponse > [16] org.apache.catalina.connector.CoyoteAdapter.service > [17] org.apache.coyote.http11.Http11Processor.service > [18] org.apache.coyote.AbstractProcessorLight.process > [19] org.apache.coyote.AbstractProtocol$ConnectionHandler.process > [20] org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun > [21] org.apache.tomcat.util.net.SocketProcessorBase.run > [22] java.util.concurrent.ThreadPoolExecutor.runWorker > [23] java.util.concurrent.ThreadPoolExecutor$Worker.run > [24] org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run > [25] java.lang.Thread.run > > --- 769519847169 ns (41.03%), 314384 samples > [ 0] org.apache.tomcat.util.collections.SynchronizedStack > [ 1] org.apache.tomcat.util.collections.SynchronizedStack.pop > [ 2] org.apache.tomcat.util.net.NioBlockingSelector.write > [ 3] org.apache.tomcat.util.net.NioSelectorPool.write > [ 4] org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper.doWrite > [ 5] org.apache.tomcat.util.net.SocketWrapperBase.doWrite > [ 6] org.apache.tomcat.util.net.SocketWrapperBase.flushBlocking > [ 7] org.apache.tomcat.util.net.SocketWrapperBase.flush > [ 8] org.apache.coyote.http11.Http11OutputBuffer$SocketOutputBuffer.end > [ 9] org.apache.coyote.http11.filters.IdentityOutputFilter.end > [10] org.apache.coyote.http11.Http11OutputBuffer.end > [11] org.apache.coyote.http11.Http11Processor.finishResponse > .... > > > I've made the following changes > - > https://github.com/apache/tomcat/compare/9.0.x...martin-g:feature/lock-free-queue-and-stack > and ran the test suite - all green! > > Now the output from the locks profiling looks like: > > > --- 72354435197 ns (47.98%), 4008 samples > [ 0] java.util.concurrent.locks.ReentrantLock$NonfairSync > [ 1] jdk.internal.misc.Unsafe.park > [ 2] java.util.concurrent.locks.LockSupport.park > [ 3] java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire > [ 4] > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly > [ 5] java.util.concurrent.locks.ReentrantLock$Sync.lockInterruptibly > [ 6] java.util.concurrent.locks.ReentrantLock.lockInterruptibly > [ 7] java.util.concurrent.LinkedBlockingQueue.poll > [ 8] org.apache.tomcat.util.threads.TaskQueue.poll > [ 9] org.apache.tomcat.util.threads.TaskQueue.poll > [10] java.util.concurrent.ThreadPoolExecutor.getTask > [11] java.util.concurrent.ThreadPoolExecutor.runWorker > [12] java.util.concurrent.ThreadPoolExecutor$Worker.run > [13] org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run > [14] java.lang.Thread.run > > --- 58413748017 ns (38.74%), 10179 samples > [ 0] java.lang.Object > [ 1] org.apache.catalina.core.StandardContext$ContextFilterMaps.asArray > [ 2] org.apache.catalina.core.StandardContext.findFilterMaps > [ 3] org.apache.catalina.core.ApplicationFilterFactory.createFilterChain > [ 4] org.apache.catalina.core.StandardWrapperValve.invoke > [ 5] org.apache.catalina.core.StandardContextValve.invoke > [ 6] org.apache.catalina.authenticator.AuthenticatorBase.invoke > [ 7] org.apache.catalina.core.StandardHostValve.invoke > [ 8] org.apache.catalina.valves.ErrorReportValve.invoke > [ 9] org.apache.catalina.core.StandardEngineValve.invoke > [10] org.apache.catalina.connector.CoyoteAdapter.service > [11] org.apache.coyote.http11.Http11Processor.service > [12] org.apache.coyote.AbstractProcessorLight.process > [13] org.apache.coyote.AbstractProtocol$ConnectionHandler.process > [14] org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun > [15] org.apache.tomcat.util.net.SocketProcessorBase.run > [16] java.util.concurrent.ThreadPoolExecutor.runWorker > [17] java.util.concurrent.ThreadPoolExecutor$Worker.run > [18] org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run > [19] java.lang.Thread.run > > --- 5856040617 ns (3.88%), 11395 samples > [ 0] org.apache.tomcat.util.collections.SynchronizedQueue > [ 1] org.apache.tomcat.util.collections.SynchronizedQueue.offer > [ 2] org.apache.tomcat.util.net.NioEndpoint$Poller.addEvent > [ 3] org.apache.tomcat.util.net.NioEndpoint$Poller.add > [ 4] > org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper.registerReadInterest > [ 5] org.apache.coyote.AbstractProtocol$ConnectionHandler.process > ... > > > I monitor the GC both with -Xloggc:gc.log and with JMX and the patterns > are the same, no new spikes. > JDK 14 with G1GC. > > I load the application with wrk (https://github.com/wg/wrk/) and it > makes 70/80K per second. > > I guess there is a reason to use these custom synchronized data > structures but so far I don't see any drawbacks after moving to the > lock-free impls. > > Regards, > Martin > --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org For additional commands, e-mail: dev-h...@tomcat.apache.org