I've committed a few experiments I've been working on to: https://svn.apache.org/repos/asf/cxf/sandbox/dkulp_async_clients
Basically, I've been trying to find an async client that is somewhat usable for CXF without completely rewriting all of CXF. Not exactly an easy task. For POSTs, they are pretty much all designed around being able to blast out pre-rendered content (like a File or a byte[]). That doesn't really fit with CXF's way of streaming out the SOAP messages as they are created.

My notes on the 4 client APIs I've played with:

1) Ning/Sonatype Async Client: http://sonatype.github.com/async-http-client/
(using the Netty backend, didn't try the others)

This one really needed pre-rendered content. When you can determine the Content-Length up front, it's really not too bad. However, once you try to flip to chunked, I just kept running into issues. In the end, I could not find an API that would let me use chunked encoding and that actually worked. The one that looked like it should work didn't write the chunk headers out. No idea what's up with it. Got very frustrated with it.

2) Netty directly: https://netty.io/

I did get the simple cases working with this fairly easily. It seems quite powerful and it actually performed really well. The problem is that it's very low level and doesn't provide a lot of things, like Keep-Alive connection management, "out of the box". That's something we'd have to write a lot of code to handle, and not something I really wanted to tackle. That said, for small requests, Netty was the only one to stick the HTTP headers AND the body into a single network packet. Even the URLConnection in the JDK doesn't do that. Pretty cool.

3) Jetty Client: http://wiki.eclipse.org/Jetty/Tutorial/HttpClient

I did get this to work for most of my test cases. There are definitely a few "issues" that I think are likely bugs in Jetty, but they are easily worked around. For example, to get it to ask for chunks, you have to first return an "empty" input stream. Minor bugs aside, it did seem to work OK. However, using the chunks did expose some issues on the raw network side.
Doing tcp dumps revealed that most of the chunk headers ended up in their own TCP packets, so it ended up taking a bit longer due to the extra packets transferred. The other thing I didn't really like was that it required much more byte[] copying than I really wanted.

4) Apache HTTP Components (HC)

This was the first one I tried. I ran into performance issues, abandoned it to test the others, then came back to it and figured out the performance issue. :-)

I had this "working", but a simple "hello world" echo in a loop resulted in VERY VERY slow operation, about 20x slower than the URLConnection in the JDK. I couldn't figure out what was going on, which is why I started looking at the others. I came back to it, started doing wireshark captures, and discovered that it was waiting for ACK packets whereas the other clients were not. The main issue was that the docs for how to set the TCP_NO_DELAY flag (which, to me, should be the default) seem to be more geared toward the 3.x or non-NIO versions. Anyway, once I managed to get that set, things improved significantly.

For non-chunked data, it seems to be working very well. For chunked data, it seems to work well 99% of the time. It's that last 1% that's going to drive me nuts. :-( It's occasionally writing out bad chunk headers, and I have no idea why. A raw wireshark look certainly shows bad chunk headers heading out. I don't know yet if it's something I'm doing or a bug in their stuff.

In any case, I'm likely going to pursue option #4 a bit more and see if I can figure out the last issue with it.

From a performance standpoint, for synchronous request/response, none of them perform as well as the in-JDK HttpURLConnection for what we do. Netty came the closest at about 5% slower. HC was about 10% slower, Jetty about 12%. I gave up on Ning before running benchmarks. However, as expected, the real win is when you use the JAX-WS async APIs to call a "slow" service.
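
To illustrate why the async case wins against a slow service, here is a minimal sketch in plain JDK Java. It is not CXF, JAX-WS, or HC code: the class SlowServiceFanOut, the method callSlowService, and the request count and delay are all hypothetical, and the "slow service" is simulated with a timer rather than a real HTTP call. The point is only that, when waiting for a response does not hold a thread, many requests can be in flight at once and the total time stays close to one service delay.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SlowServiceFanOut {

    /** Outcome of one run: how many calls completed and how long the run took. */
    public static final class Result {
        public final int completed;
        public final long elapsedMillis;

        Result(int completed, long elapsedMillis) {
            this.completed = completed;
            this.elapsedMillis = elapsedMillis;
        }
    }

    // Hypothetical stand-in for a non-blocking call to a slow service:
    // the future completes after delayMillis, and no thread is parked
    // while the "response" is pending.
    static CompletableFuture<String> callSlowService(
            ScheduledExecutorService timer, int id, long delayMillis) {
        CompletableFuture<String> response = new CompletableFuture<>();
        timer.schedule(() -> {
            response.complete("response-" + id);
        }, delayMillis, TimeUnit.MILLISECONDS);
        return response;
    }

    /** Fires all requests up front, then waits for every response. */
    public static Result run(int requests, long delayMillis) {
        // A single timer thread simulates the whole "server side".
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        try {
            long start = System.nanoTime();
            List<CompletableFuture<String>> inFlight = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                inFlight.add(callSlowService(timer, i, delayMillis));
            }
            // Block once, at the end, until all responses have arrived.
            CompletableFuture.allOf(inFlight.toArray(new CompletableFuture[0])).join();
            long elapsed = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);

            int completed = 0;
            for (CompletableFuture<String> f : inFlight) {
                if (f.isDone() && !f.isCompletedExceptionally()) {
                    completed++;
                }
            }
            return new Result(completed, elapsed);
        } finally {
            timer.shutdownNow();
        }
    }

    public static void main(String[] args) {
        Result r = run(500, 200); // 500 simulated calls, 200 ms each (made-up numbers)
        System.out.println(r.completed + " responses in " + r.elapsedMillis + " ms");
    }
}
```

Under these assumptions the elapsed time should stay far below requests x delay, which is what a one-thread-per-blocked-call design would cost with a single worker. A real client adds connection pool limits and network overhead on top, so this is only a model of the effect, not a benchmark.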
I only have the HC version working for the async case at this point. If you tune the connection pool to allow virtually unlimited connections to localhost (which mimics HttpURLConnection for comparison), it's pretty awesome. I have a simple service that uses CXF continuations to delay for about 2 seconds, to mimic some processing on the server side. 5K async requests using the in-JDK stuff, with our default thread pool settings and such, take about 35 seconds to all complete. With HC, I have that down to about 9 seconds. However, that does require a bit of tuning of the HC defaults and settings, which is the next bit of complication. Using the async APIs to call a "fast" service (where the response is returned very quickly) won't benefit as much.

In any case, I've committed my experiments to the sandbox. It does require the latest trunk code for transport-http as well. Any help or thoughts or anything would be more than welcome. :-)

--
Daniel Kulp
[email protected] - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com
