> On Feb 3, 2015, at 8:11 AM, Alessio Soldano <[email protected]> wrote:
>
> A brief update: I've committed the workaround for the HTTP connection issue
> (thanks Dan for the help on that!) as well as some other straightforward
> optimizations on stuff that popped up while profiling.
>
> Now, the next "big" topic I found is the way we get and set properties in the
> message context. We spend a significant amount of time creating HashMap
> instances and especially in MessageImpl#calcContextCache, which copies
> (putAll) all the Bus, Service, Endpoint, etc. properties into the context
> cache. You can see [1] the CPU hotspot view I get currently with the
> previously mentioned test app. AFAICS in the source history, there used to be
> a different way of dealing with message properties in the past [2], then the
> cache mechanism was added. So I'm wondering if some kind of profiling / perf
> testing has been performed in the past and led to the changes. I might
> simply be testing an edge scenario, with very few properties being looked up
> and hence not justifying the caching mechanism.
> Any comment / idea / suggestion?

At one point, every “get” of a property would end up checking 4 or 5 hash maps, which resulted in the keys being hashCoded many times, lots of checks, etc… When you get into the WS-Security cases and some of the HTTP configuration cases where there are a bunch of keys being looked up, there was a LOT of time being spent on the lookups. For the most part, at the time, the maps were relatively small and the cost to build a single “context” map was small in comparison, which is why this was done.

That said, the size of the cache map is likely fairly small as well. Maybe a dozen keys? (And I'm willing to bet most of the keys are interned, where a == would catch it.) Might be simpler to just use an Object[] or something.

Dan

>
> Cheers
> Alessio
>
> [1] http://pasteboard.co/QgiD4Af.png
> [2]
>
> On 27/01/15 18:14, Alessio Soldano wrote:
>> Hi,
>> my attention has been recently brought to a scenario in which an Apache CXF
>> client invokes an endpoint operation in a loop and the number of invocations
>> performed in a given amount of time (say, 2 minutes) is used as a benchmark
>> for measuring WS stack performance. It's actually a very simplistic
>> scenario, with a plain JAX-WS single thread client sending and receiving
>> small RPC/Lit SOAP messages [1]. The reason why I've been asked to have a
>> look is that with default settings the Apache CXF JAX-WS impl seems to
>> perform *shamefully* badly compared to the Metro (JAX-WS RI) implementation.
>> I've been blaming the user log configuration, etc., but when I eventually
>> tried on my own I could actually reproduce the bad results. I've been
>> profiling a bit and found a few hot spot areas where CXF could possibly be
>> optimized, but the big issue really seems to be at the HTTPConduit /
>> HttpURLConnection level.
>> I found that almost all the invocations end up in
>> sun.net.www.http.HttpClient.New(..) calling the available() method [2] as part
>> of the process for re-using cached connections [3]; that goes to the wire to
>> try reading and takes a lot of time.
>> When the RI does the equivalent operation, the available() method is not
>> called [4], resulting in much better performance.
>> By looking at the JDK code, it looks to me that the problem boils down to
>> sun.net.www.protocol.http.HttpURLConnection#streaming() [5] returning
>> different values, as a consequence of the fixedContentLength attribute being
>> set to a value different from -1 when running on CXF only. As a matter of
>> fact, that is set when HTTPConduit.WrappedOutputStream#thresholdNotReached()
>> is called, whenever a message is completely written to the output stream
>> buffer before the chunking threshold is reached (at least AFAIU). I've
>> searched through the JAX-WS RI and could not find any place where
>> setFixedLengthStreamingMode is called on the connection instead.
>> So, I've performed two quick and dirty tries: the first time I forced
>> allowChunking = false on the client policy, the second time I commented out
>> the code in HTTPConduit.WrappedOutputStream#thresholdNotReached(). In both
>> cases I managed to get performance comparable to what I can get with the
>> JAX-WS RI.
>> Now, a few questions:
>> - are we really required to call setFixedLengthStreamingMode as we currently
>> do? what's the drawback of not calling it?
>> - should we actually do something for getting decent performance by default
>> in this scenario? (not sure expecting the user to disable chunking is really
>> an option...)
>> As a side note, the relevant part of the JDK HttpClient code changed between
>> JDK6 and JDK7, so things have not always been as explained above...
>>
>> Cheers
>> Alessio
>>
>>
>> [1] http://www.fpaste.org/176166/14223765/
>> [2] http://pasteboard.co/FR5QVrP.png
>> [3]
>> http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/7u40-b43/sun/net/www/http/HttpClient.java#276
>> [4] http://pasteboard.co/FR8okYM.png
>> [5]
>> http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/7u40-b43/sun/net/www/protocol/http/HttpURLConnection.java#HttpURLConnection.streaming%28%29
>>
>
>
> --
> Alessio Soldano
> Web Service Lead, JBoss
>

--
Daniel Kulp
[email protected] - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com
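
For illustration, here is a minimal sketch of the "just use an Object[]" idea from the reply above. It is not CXF code: the class name FlatPropertyTable and its methods are hypothetical, and it assumes non-null String keys and roughly a dozen entries, so that a linear scan with an == check first stays cheap.

```java
import java.util.Arrays;

/** Hypothetical illustration only; not part of CXF. */
public class FlatPropertyTable {
    // Keys and values are interleaved: [k0, v0, k1, v1, ...]
    private Object[] slots = new Object[24]; // room for a dozen entries
    private int used;                        // number of slots in use (always even)

    /** Returns the value stored for key, or null if absent. Keys must be non-null. */
    public Object get(String key) {
        // First pass: identity comparison, which hits when both sides are interned constants.
        for (int i = 0; i < used; i += 2) {
            if (slots[i] == key) {
                return slots[i + 1];
            }
        }
        // Second pass: fall back to equals() for keys that were built at runtime.
        for (int i = 0; i < used; i += 2) {
            if (key.equals(slots[i])) {
                return slots[i + 1];
            }
        }
        return null;
    }

    /** Adds or overwrites the value for key. Keys must be non-null. */
    public void put(String key, Object value) {
        for (int i = 0; i < used; i += 2) {
            if (slots[i] == key || key.equals(slots[i])) {
                slots[i + 1] = value;
                return;
            }
        }
        if (used == slots.length) {
            // Grow only when the assumed "about a dozen keys" is exceeded.
            slots = Arrays.copyOf(slots, slots.length * 2);
        }
        slots[used++] = key;
        slots[used++] = value;
    }

    /** Number of key/value pairs stored. */
    public int size() {
        return used / 2;
    }

    public static void main(String[] args) {
        FlatPropertyTable table = new FlatPropertyTable();
        table.put("hypothetical.timeout", 30);
        // A string literal is interned, so this lookup is resolved by the == pass.
        System.out.println(table.get("hypothetical.timeout"));
        // A freshly constructed String is a different object, so equals() is used.
        System.out.println(table.get(new String("hypothetical.timeout")));
    }
}
```

Whether something like this actually beats a HashMap for the property counts seen in practice would have to be confirmed by profiling.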
