> On Feb 3, 2015, at 8:11 AM, Alessio Soldano <[email protected]> wrote:
> 
> A brief update: I've committed the workaround for the HTTP connection issue 
> (thanks Dan for the help on that!), as well as some other straightforward 
> optimizations for things that popped up while profiling.
> 
> Now, the next "big" topic I found is the way we get and set properties in the 
> message context. We spend a significant amount of time creating HashMap 
> instances, and especially in MessageImpl#calcContextCache, which copies 
> (putAll) all the Bus, Service, Endpoint, etc. properties into the context 
> cache. You can see [1] the CPU hotspot view I currently get with the 
> previously mentioned test app. AFAICS from the source history, there used to be 
> a different way of dealing with message properties in the past [2], and then the 
> cache mechanism was added. So I'm wondering whether some kind of profiling / perf 
> testing was performed in the past and led to these changes. I might 
> simply be testing an edge scenario, with very few properties being looked up 
> and hence not enough to justify the caching mechanism.
> Any comment / idea / suggestion?

At one point, every “get” of a property would end up checking 4 or 5 hash maps, 
which resulted in the keys being hashCoded many times, lots of checks, etc… 
When you get into the WS-Security cases and some of the HTTP configuration 
cases where there are a bunch of keys being looked up, there was a LOT of time 
being spent on the lookups. For the most part, at the time, the maps were 
relatively small and the cost to build a single “context” map was small in 
comparison, which is why this was done.
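
To make the trade-off concrete, here's a rough sketch (illustrative names only, 
not the actual CXF classes) of the per-get layered lookup versus roughly what 
calcContextCache does today:

import java.util.HashMap;
import java.util.Map;

// Illustrative only; the real code lives in MessageImpl and friends.
class ContextProperties {
    private final Map<String, Object> busProps = new HashMap<>();
    private final Map<String, Object> serviceProps = new HashMap<>();
    private final Map<String, Object> endpointProps = new HashMap<>();
    private final Map<String, Object> messageProps = new HashMap<>();
    private Map<String, Object> cache;

    // Old style: every get walks several maps, hashing the key each time.
    Object getLayered(String key) {
        Object v = messageProps.get(key);
        if (v == null) v = endpointProps.get(key);
        if (v == null) v = serviceProps.get(key);
        if (v == null) v = busProps.get(key);
        return v;
    }

    // Cached style: copy all scopes into one map up front (the putAll cost
    // you're seeing), so each subsequent get is a single hash lookup.
    Object getCached(String key) {
        if (cache == null) {
            cache = new HashMap<>();
            cache.putAll(busProps);
            cache.putAll(serviceProps);
            cache.putAll(endpointProps);
            cache.putAll(messageProps);
        }
        return cache.get(key);
    }
}

With many lookups per message the cached form wins; with only a couple of 
lookups the putAll cost dominates, which is the edge case you describe above.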

That said, the size of the cache map is likely fairly small as well. Maybe a 
dozen keys? (And I'm willing to bet most of the keys are interned, so an == 
comparison would catch it.) It might be simpler to just use an Object[] or something.
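
A rough sketch of that kind of flat structure, assuming mostly-interned String 
keys so reference equality short-circuits the common case (names are made up, 
not a proposed API):

import java.util.Arrays;

// Illustrative only: a tiny flat key/value store for a dozen-ish entries.
class FlatProps {
    private Object[] kv = new Object[24]; // key at index i, value at i + 1
    private int size;

    Object get(String key) {
        for (int i = 0; i < size; i += 2) {
            Object k = kv[i];
            if (k == key || key.equals(k)) {   // == first for interned keys
                return kv[i + 1];
            }
        }
        return null;
    }

    void put(String key, Object value) {
        for (int i = 0; i < size; i += 2) {
            if (kv[i] == key || key.equals(kv[i])) {
                kv[i + 1] = value;
                return;
            }
        }
        if (size == kv.length) {
            kv = Arrays.copyOf(kv, kv.length * 2);
        }
        kv[size] = key;
        kv[size + 1] = value;
        size += 2;
    }
}

For a dozen keys, a linear scan like this avoids both the hashing and the 
per-entry node allocation of a HashMap.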

Dan



> 
> Cheers
> Alessio
> 
> [1] http://pasteboard.co/QgiD4Af.png
> [2]
> 
> On 27/01/15 18:14, Alessio Soldano wrote:
>> Hi,
>> my attention has been recently brought to a scenario in which an Apache CXF 
>> client invokes an endpoint operation in a loop, and the number of invocations 
>> performed in a given amount of time (say, 2 minutes) is used as a benchmark 
>> for measuring WS stack performance. It's actually a very simplistic 
>> scenario, with a plain JAX-WS single-threaded client sending and receiving 
>> small RPC/Lit SOAP messages [1]. The reason I've been asked to have a 
>> look is that with default settings the Apache CXF JAX-WS impl seems to 
>> perform *shamefully* badly compared to the Metro (JAX-WS RI) implementation. 
>> I'd been blaming the user's log configuration, etc., but when I eventually 
>> tried it on my own I could actually reproduce the bad results. I've been 
>> profiling a bit and found a few hotspot areas where CXF could possibly be 
>> optimized, but the big issue really seems to be at the HTTPConduit / 
>> HTTPURLConnection level.
>> I found that almost all the invocations end up in 
>> sun.net.www.http.HttpClient.New(..) calling the available() method [2] as part 
>> of the process for re-using cached connections [3]; that goes to the wire to 
>> attempt a read and takes a lot of time.
>> When the RI does the equivalent operation, the available() method is not 
>> called [4], resulting in much better performance.
>> Looking at the JDK code, it seems to me that the problem boils down to 
>> sun.net.www.protocol.http.HttpURLConnection#streaming() [5] returning 
>> different values, as a consequence of the fixedContentLength attribute being 
>> set to a value different from -1 only when running on CXF. As a matter of 
>> fact, that is set when HTTPConduit.WrappedOutputStream#thresholdNotReached() 
>> is called, i.e. whenever a message is completely written to the output stream 
>> buffer before the chunking threshold is reached (at least AFAIU). I've 
>> searched through the JAX-WS RI and, by contrast, could not find any place where 
>> setFixedLengthStreamingMode is called on the connection.
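>> For reference, here's a minimal illustration (plain HttpURLConnection, not the 
>> CXF code; the URL and payload are placeholders) of the three ways the output 
>> side can be set up, only the first two of which put the connection into 
>> "streaming" mode:
>> 
>> import java.io.OutputStream;
>> import java.net.HttpURLConnection;
>> import java.net.URL;
>> import java.nio.charset.StandardCharsets;
>> 
>> public class StreamingModeDemo {
>>     public static void main(String[] args) throws Exception {
>>         byte[] body = "<soap:Envelope/>".getBytes(StandardCharsets.UTF_8);
>>         URL url = new URL("http://localhost:8080/svc"); // placeholder endpoint
>>         HttpURLConnection conn = (HttpURLConnection) url.openConnection();
>>         conn.setDoOutput(true);
>>         conn.setRequestMethod("POST");
>> 
>>         // Option A: fixed-length streaming mode; per the above, this is what
>>         // the conduit ends up triggering when the chunking threshold is not
>>         // reached, since the full length is known.
>>         conn.setFixedLengthStreamingMode(body.length);
>> 
>>         // Option B: chunked streaming mode (the other streaming setter).
>>         // conn.setChunkedStreamingMode(4096);
>> 
>>         // Option C: call neither; the JDK then buffers the whole body
>>         // internally and computes Content-Length itself.
>> 
>>         try (OutputStream out = conn.getOutputStream()) {
>>             out.write(body);
>>         }
>>         System.out.println("HTTP status: " + conn.getResponseCode());
>>     }
>> }
>> 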
>> So, I've performed two quick-and-dirty experiments: the first time I forced 
>> allowChunking = false on the client policy, and the second time I commented out 
>> the code in HTTPConduit.WrappedOutputStream#thresholdNotReached(). In both 
>> cases I managed to get performance comparable to what I get with the 
>> JAX-WS RI.
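>> (The first experiment corresponds roughly to the standard programmatic conduit 
>> configuration below, where "port" is any CXF-backed JAX-WS proxy obtained from 
>> the generated Service class:)
>> 
>> import org.apache.cxf.endpoint.Client;
>> import org.apache.cxf.frontend.ClientProxy;
>> import org.apache.cxf.transport.http.HTTPConduit;
>> import org.apache.cxf.transports.http.configuration.HTTPClientPolicy;
>> 
>> public final class DisableChunking {
>>     static void disableChunking(Object port) {
>>         Client client = ClientProxy.getClient(port);
>>         HTTPConduit conduit = (HTTPConduit) client.getConduit();
>> 
>>         HTTPClientPolicy policy = new HTTPClientPolicy();
>>         policy.setAllowChunking(false); // force allowChunking = false on the client policy
>>         conduit.setClient(policy);
>>     }
>> }
>> 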
>> Now, a few questions:
>> - Are we really required to call setFixedLengthStreamingMode as we currently 
>> do? What's the drawback of not calling it?
>> - Should we actually do something to get decent performance by default in 
>> this scenario? (I'm not sure expecting the user to disable chunking is really 
>> an option...)
>> As a side note, the relevant part of the JDK HttpClient code changed between 
>> JDK 6 and JDK 7, so things have not always been as explained above...
>> 
>> Cheers
>> Alessio
>> 
>> 
>> [1] http://www.fpaste.org/176166/14223765/
>> [2] http://pasteboard.co/FR5QVrP.png
>> [3] 
>> http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/7u40-b43/sun/net/www/http/HttpClient.java#276
>> [4] http://pasteboard.co/FR8okYM.png
>> [5] 
>> http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/7u40-b43/sun/net/www/protocol/http/HttpURLConnection.java#HttpURLConnection.streaming%28%29
>> 
> 
> 
> -- 
> Alessio Soldano
> Web Service Lead, JBoss
> 

-- 
Daniel Kulp
[email protected] - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com
