I'm getting kind of sick of saying "turn off chunking" so I've been experimenting with the benchmarks to see what we can do and also get a feel for what we lose/gain with it.
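For anyone following along at home: the knob I keep pointing people at is the allowChunking flag on the client policy. Assuming you're going through the standard HTTPConduit API, and with "port" standing in for your JAX-WS proxy, flipping it programmatically looks something like:

    import org.apache.cxf.endpoint.Client;
    import org.apache.cxf.frontend.ClientProxy;
    import org.apache.cxf.transport.http.HTTPConduit;
    import org.apache.cxf.transports.http.configuration.HTTPClientPolicy;

    // Grab the conduit behind the JAX-WS proxy and toggle chunking
    Client client = ClientProxy.getClient(port);
    HTTPConduit conduit = (HTTPConduit) client.getConduit();
    HTTPClientPolicy policy = new HTTPClientPolicy();
    policy.setAllowChunking(false);   // false = buffer and send Content-Length
    conduit.setClient(policy);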

First thing I've learned: you REALLY want to use the ParallelGC stuff on multi-core systems. Huge boost with that on. (I wonder if I can get the unit tests/maven using it..... Hmm.....)
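That's just the stock HotSpot flag, e.g.:

    java -XX:+UseParallelGC -Xmx1024m ...

For the unit tests/maven case I'd guess it means passing the same flag through surefire's argLine, but I haven't tried that yet.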

Basically, I tested various message sizes in three scenarios:
1) CPU bound - lots of threads sending requests so the CPU is pegged (plenty of memory, -Xmx1024m)
2) Memory bound - only a couple of threads, but a low -Xmx setting (I used 64M)
3) Not bound - 2 threads (dual-core client machine and dual-core server machine)

There are two important things to measure:
1) Total requests per second
2) Latencies


Basically, with chunking, "chunks" of the request can be sent to the server, and the server can start processing them while the client produces more. Thus, the server can start doing the JAXB deserialization of the first parts of the message while the client is still using JAXB to write the last part. The big benefit is latency: the server has already deserialized most of the data by the time the client finishes sending it.


For the unbound case (case 3):

For SMALL messages (< 2K or so), turning off chunking doesn't seem to have any adverse effects. In fact, on higher-latency connections (11 Mbit wireless compared to gigabit), turning it off can help a bit, as chunking tends to send an extra network packet.

However, once the message gets above 8K or so, chunking starts to really help.

Once it gets up to about 24K, the difference is pretty big: the latencies are much lower, so the unbound clients can send more requests, nearly 30% higher TPS. If your benchmark is a few threads pounding on the server, you really want chunking turned on.


Case 2 gets similar results. Because HttpURLConnection needs to buffer the full request in the unchunked case, it puts a big load on the heap and the garbage collector (again, ParallelGC helps). For small messages, the two are comparable. However, as the message grows, chunking keeps the heap in better shape and puts less strain on the GC. At some point the messages still work with chunking on but we get OutOfMemoryErrors with chunking off (I was at messages around 10M at that point). Chunking kept working all the way up to 50M.
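That buffering is plain JDK behavior rather than anything CXF-specific: unless the connection is put into a streaming mode, HttpURLConnection holds the whole body in memory so it can compute the Content-Length header up front. A minimal sketch of the two modes (hypothetical URL, no error handling):

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    HttpURLConnection conn =
        (HttpURLConnection) new URL("http://host/service").openConnection();
    conn.setDoOutput(true);

    // Chunked: the body is streamed out in 4K chunks; nothing piles up in the heap.
    conn.setChunkedStreamingMode(4096);

    // Default (no streaming mode set): the connection buffers the ENTIRE body
    // in memory so it can send a Content-Length header first.

    OutputStream out = conn.getOutputStream();
    // ... JAXB marshals the message straight into 'out' ...
    out.close();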


In case 1, where it's CPU bound, chunking or no chunking had very little effect. Chunking allows the server to process things ahead of time, but that only really works well if the client and server have CPU cycles to spare. In fact, chunking takes a little more CPU work to decode, so the non-chunked case is very slightly faster (barely measurable, like 1-2%).


So, where does this leave us? I'm not sure. We COULD add a "chunkingThreshold" parameter to the HTTP conduit client parameters, defaulted to something like 4K: buffer up to that amount, and if the request completes (stream.close() called) before the buffer fills, set the Content-Length and send it non-chunked; once it goes over the threshold, switch to chunking. That would allow small messages to keep working with the older services. The question is: will that help or make things worse? Would we get support requests like "can CXF not handle big messages?" when it works for the small requests but suddenly stops working for the larger ones?
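To make the idea concrete, here's a rough sketch of what that wrapper stream could look like. The class name and the HttpURLConnection plumbing are hypothetical, purely for illustration, not actual CXF code:

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.net.HttpURLConnection;

    // Buffers up to 'threshold' bytes; if close() arrives first, the request
    // goes out non-chunked with a Content-Length. Otherwise it switches to
    // chunked streaming and flushes the buffer down the wire.
    class ThresholdingOutputStream extends OutputStream {
        private final HttpURLConnection conn;
        private final int threshold;
        private ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        private OutputStream wire;   // null until we commit to a mode

        ThresholdingOutputStream(HttpURLConnection conn, int threshold) {
            this.conn = conn;
            this.threshold = threshold;
        }

        public void write(int b) throws IOException {
            if (wire != null) {
                wire.write(b);
                return;
            }
            buffer.write(b);
            if (buffer.size() > threshold) {
                conn.setChunkedStreamingMode(threshold);  // went over: chunk it
                wire = conn.getOutputStream();
                buffer.writeTo(wire);
                buffer = null;
            }
        }

        public void close() throws IOException {
            if (wire == null) {   // finished under the threshold
                conn.setFixedLengthStreamingMode(buffer.size());
                wire = conn.getOutputStream();
                buffer.writeTo(wire);
            }
            wire.close();
        }
    }

Small messages would then look exactly like today's non-chunked requests, so the older services keep working; the open question above only applies to anything bigger.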

Anyway, anyone else have some thoughts?


---
Daniel Kulp
[EMAIL PROTECTED]
http://www.dankulp.com/blog



