Re: Thoughts on chunking....

Glen Mazza Fri, 18 Jul 2008 10:21:26 -0700

Would you say that the number of web service providers that require the
content-length field is decreasing over time?  Certainly, not too many CXF
installations would require it.


I only remember the eBay client having that problem of requiring the
Content-Length field.  CXF has since been fixed to output the HTML response
stream when that occurs; when it happened last year I had to go through
Wireshark to retrieve that HTML stream in order to determine the problem.
So this problem is getting less severe over time--both in user cluelessness
and in servers that require content-lengths.

Also, how much of a performance hit would that chunkingThreshold cause?
Probably not too much, I think.
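
To make the cost concrete, here is a minimal sketch of the threshold idea from the quoted message below: buffer up to the threshold, go non-chunked with a known length if close() arrives first, otherwise switch to chunking. This is not CXF code; ThresholdStream and its Transport callback are hypothetical names standing in for the HTTP conduit. Under these assumptions the extra work is one in-memory copy of at most `threshold` bytes per request.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical sketch, not the CXF implementation.
class ThresholdStream extends OutputStream {
    /** Hypothetical stand-in for the HTTP conduit: opens the real request stream. */
    interface Transport {
        OutputStream openFixedLength(int contentLength) throws IOException;
        OutputStream openChunked() throws IOException;
    }

    private final int threshold;
    private final Transport transport;
    private ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private OutputStream wire; // non-null once a transfer mode has been chosen

    ThresholdStream(int threshold, Transport transport) {
        this.threshold = threshold;
        this.transport = transport;
    }

    @Override
    public void write(int b) throws IOException {
        write(new byte[] {(byte) b}, 0, 1);
    }

    @Override
    public void write(byte[] data, int off, int len) throws IOException {
        if (wire == null && buffer.size() + len > threshold) {
            // Over the threshold: commit to chunking and flush what was buffered.
            wire = transport.openChunked();
            buffer.writeTo(wire);
            buffer = null;
        }
        if (wire != null) {
            wire.write(data, off, len);
        } else {
            buffer.write(data, off, len);
        }
    }

    @Override
    public void close() throws IOException {
        if (wire == null) {
            // Finished under the threshold: the length is known, so send unchunked.
            wire = transport.openFixedLength(buffer.size());
            buffer.writeTo(wire);
            buffer = null;
        }
        wire.close();
    }
}
```

With a 4K threshold, a 100-byte request would go out through openFixedLength(100), while anything larger than 4K would go through openChunked().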

Glen


dkulp wrote:
> 
> 
> I'm getting kind of sick of saying "turn off chunking" so I've been  
> experimenting with the benchmarks to see what we can do and also get a  
> feel for what we lose/gain with it.
> 
> First thing I've learned: you REALLY want to use the ParallelGC stuff  
> on multi-core systems.   Huge boost with that on.   (I wonder if I can  
> get the unit tests/maven using it..... Hmm.....)
> 
> Basically, I tested various message sizes in three scenarios:
> 1) CPU bound - lots of threads sending requests so the CPU is pegged.   
> (lots of memory -Xmx1024m)
> 2) Memory bound - only a couple threads, but a low -Xmx setting (I used
> 64M)
> 3) Not bound - 2 threads (dual core client machine and dual core  
> server machine)
> 
> There are two important things to measure:
> 1) Total requests per second
> 2) Latencies
> 
> 
> Basically, by using chunking, "chunks" of the request can be sent to  
> the server and the server can start processing them while the client  
> produces more chunks.   Thus, the server can start doing the JAXB  
> deserializing of the first parts of the message while the client is  
> still using JAXB to write the last part.   The big benefit to this is  
> latencies.   The server already has deserialized most of the data by  
> the time the client is done sending it.
> 
> 
> For the unbound case (case 3):
> 
> For SMALL messages (< 2K or so) turning off chunking doesn't seem to
> have any adverse effects.  On higher latency connections (11 Mbit
> wireless compared to gigabit), it can actually help a bit, as
> chunking tends to send an extra network packet.
> 
> However, once it gets above 8K or so, chunking starts to really help.
> 
> Once it gets up to about 24K, the difference is pretty big.   The  
> latencies are much lower so the unbound clients can send more  
> requests.  Nearly 30% higher TPS.    If your benchmark is few threads  
> pounding on the server, you really want the chunking turned on.
> 
> 
> Case 2 gets similar results.    Because HttpURLConnection needs to
> buffer the full request in the unchunked case, it puts a big load on
> the heap and the garbage collector.  (Again, ParallelGC helps.)    For
> small messages, the two are comparable.   However, as the message
> grows, the chunking helps keep the heap in better shape and puts less
> strain on the gc.   At some point, with chunking on, the messages work
> and with chunking off, we get OutOfMemoryErrors.  (My messages were
> around 10M at that point.)   Chunking was still working all the way
> up to 50M.
> 
> 
> In case 1, where it's CPU bound, chunking or no chunking had very
> little effect.   The chunking allows the server to process things
> ahead of time, but that only really works well if the client/server
> has cpu cycles to process it.    Actually, the chunking takes a little
> more cpu work to decode so the non-chunked case is very slightly
> faster (barely measurable, like 1-2%).
> 
> 
> So, where does this leave us?   I'm not sure.   We COULD add a
> "chunkingThreshold" parameter to the http conduit client parameters,
> defaulted to something like 4K.   Buffer up to that amount and if the
> request completes (stream.close() called) before it's full, set the
> content length and go non-chunked.  Once it goes over, switch to
> chunking.   That would allow small messages to work with the older
> services.   The question is: will that help or make things worse?
> Would we get support requests like "can CXF not handle big messages?"
> or similar when it works for the small requests, but suddenly stops
> working for the larger requests?
> 
> Anyway, anyone else have some thoughts?
> 
> 
> ---
> Daniel Kulp
> [EMAIL PROTECTED]
> http://www.dankulp.com/blog
> 

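For anyone wondering where the small decode cost in the quoted numbers comes from, part of it is the framing itself. Below is a minimal sketch of the standard HTTP/1.1 chunked transfer coding: each chunk is a hex length line, the data, and a CRLF, followed by a zero-length chunk at the end. The ChunkedFraming class and the chunk size are made up for illustration; this is not how CXF or HttpURLConnection writes it internally.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Illustrative only: frames a body as HTTP/1.1 chunked transfer coding.
class ChunkedFraming {
    /** Splits body into chunks of at most chunkSize bytes and adds the chunk framing. */
    static byte[] encode(byte[] body, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int off = 0; off < body.length; off += chunkSize) {
            int len = Math.min(chunkSize, body.length - off);
            writeAscii(out, Integer.toHexString(len) + "\r\n"); // chunk size in hex
            out.write(body, off, len);                          // chunk data
            writeAscii(out, "\r\n");                            // end of chunk
        }
        writeAscii(out, "0\r\n\r\n"); // last (zero-length) chunk, no trailers
        return out.toByteArray();
    }

    private static void writeAscii(ByteArrayOutputStream out, String s) {
        byte[] b = s.getBytes(StandardCharsets.US_ASCII);
        out.write(b, 0, b.length);
    }
}
```

The receiver has to parse every length line before it can read the data, which a plain Content-Length body avoids.
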
-- 
View this message in context: 
http://www.nabble.com/Thoughts-on-chunking....-tp18533011p18534074.html
Sent from the cxf-dev mailing list archive at Nabble.com.