Hi Andrew,

The chunking used for message transmission over HTTP is only about transmission, and it is entirely different from application-level chunking. Without transmission-level chunking, the sending component may need to calculate the length of the data beforehand and set it in the Content-Length header at the beginning. That is typically done by buffering the entire payload in memory before transmitting it. As this is inefficient, you typically use the chunked transfer mode so that the sending component can start transmitting the data without knowing the total length at the beginning of the transmission.
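To make that concrete: on the CXF client side the transport-level chunking is controlled on the HTTPConduit. This is only a minimal sketch, assuming you have a JAX-WS proxy object for your own service interface (the helper name is made up):

    import org.apache.cxf.endpoint.Client;
    import org.apache.cxf.frontend.ClientProxy;
    import org.apache.cxf.transport.http.HTTPConduit;
    import org.apache.cxf.transports.http.configuration.HTTPClientPolicy;

    static void configureChunking(Object jaxwsProxy) {
        // get the CXF client and its HTTP conduit behind the JAX-WS proxy
        Client client = ClientProxy.getClient(jaxwsProxy);
        HTTPConduit conduit = (HTTPConduit) client.getConduit();

        HTTPClientPolicy policy = new HTTPClientPolicy();
        policy.setAllowChunking(true);      // chunked transfer is the default
        policy.setChunkingThreshold(4096);  // bytes buffered before switching to chunked mode
        conduit.setClient(policy);
    }

Turning setAllowChunking off forces the conduit to buffer the full request and send a Content-Length header instead, which is exactly the in-memory buffering described above.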
And as this chunking is about transmission only, the size of the entire message stays the same. CXF uses file-based caching for large messages, so the message is usually never loaded entirely into memory. However, some processing code (notably code that uses DOM-based XML parsing) will load the message entirely into memory and, even worse, consume a couple of times more memory than the original message size. That is your current memory problem. The newer WS-Security code in CXF 3.0.x does not use DOM, but the one in the older 2.7.x/2.6.x versions does and is affected by this memory consumption problem.

If your application can work on fragmented/segmented data (or "chunked", as you call it), you can use that application-level chunking so that the entire data never needs to be processed at once. A rough sketch of what such a pair of operations could look like is below, after your quoted message.

regards,
aki

2014-03-12 23:57 GMT+01:00 Hart, Andrew B. <[email protected]>:
> All,
>
> Also, I'd like to explain the problems we've been having and have my
> understanding of "chunking" clarified.
>
> Now, we have some web service operations which have the potential to fetch A
> LOT of data. Some of these have been written to be "asynchronous" and/or
> "chunked". But, I'm not utilizing CXF features to accomplish this; I mean
> that we have written two endpoints, one to request data where information is
> extracted and saved off in the database with a reference / correlation id.
> Then, a second endpoint is used to fetch the data using the correlation id.
>
> In the case of the "chunked" web services, the client also supplies a chunk
> number, and so we fetch the desired subset of the rows and send them back.
>
> Now, in the case of the chunked encoding that CXF provides, that is at the
> transport level, and I understand that it is the default for responses over
> 4K. So, I can see responses coming back with...
>
> Wed Mar 12 17:28:57 CDT 2014:DEBUG:<< "Transfer-Encoding: chunked[\r][\n]"
>
> ...and I see it picking up the chunks until a zero-length chunk is sent, at
> which time it is then GZipped, run through WS-Security processing and the
> response displayed.
>
> So, essentially, the transport-level chunking allows us to send larger
> responses back without the client losing the socket and/or timing out.
> However, if we don't chunk at the *application* level, that means that, on
> both the server and the client, we have to construct and handle the larger
> response. Most of my resource limitations have been encountered when
> encrypting or decrypting very large responses, so transport-level chunking
> doesn't help with that.
>
> Is my understanding correct? Is chunking at the application/service
> implementation level the best, or most standard, way of dealing with
> problems like this?
>
> I'm completely skipping over my lack of understanding of creating
> asynchronous web services. I understand the part about returning a Future
> object to the client, and that the client uses it to get the results, but I
> could never wrap my head around how it is implemented on the server side:
> where and how does the server persist the data until the client picks it up,
> etc.
>
> Regards,
>
> Andrew
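For illustration, the application-level chunking pattern you describe (stage the result under a correlation id, then pull it back one bounded slice at a time) could be captured in a service contract roughly like this. This is only a sketch; the interface and operation names are made up and are not any CXF API:

    import java.util.List;
    import javax.jws.WebService;

    @WebService
    public interface LargeResultService {

        // Runs the query, stores the rows (e.g. in the database, as you
        // already do) and hands back a correlation id for later retrieval.
        String requestData(String criteria);

        // Returns one bounded slice of the staged rows.  The client keeps
        // incrementing chunkNumber until an empty list comes back.
        List<String> fetchChunk(String correlationId, int chunkNumber, int chunkSize);
    }

Because each fetchChunk response is bounded, the DOM-based WS-Security processing on either side only ever holds one slice in memory, which is what the transport-level chunking alone cannot give you.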
