Hi Andrew,

The chunking used for message transmission over HTTP is only about transmission, and it is entirely different from application-level chunking. Without transmission-level chunking, the sending component may need to calculate the length of the data beforehand and set it in the Content-Length header at the beginning. That is typically done by buffering the entire payload in memory before transmitting it. As this is inefficient, you typically use the chunked transfer mode so that the sending component can start transmitting the data without knowing the total length at the beginning of the transmission.
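To make that concrete: on the CXF client side the transport-level chunking is controlled on the HTTPConduit. This is only a minimal sketch, assuming you have a JAX-WS proxy object for your own service interface (the helper name is made up):

    import org.apache.cxf.endpoint.Client;
    import org.apache.cxf.frontend.ClientProxy;
    import org.apache.cxf.transport.http.HTTPConduit;
    import org.apache.cxf.transports.http.configuration.HTTPClientPolicy;

    static void configureChunking(Object jaxwsProxy) {
        // get the CXF client and its HTTP conduit behind the JAX-WS proxy
        Client client = ClientProxy.getClient(jaxwsProxy);
        HTTPConduit conduit = (HTTPConduit) client.getConduit();

        HTTPClientPolicy policy = new HTTPClientPolicy();
        policy.setAllowChunking(true);      // chunked transfer is the default
        policy.setChunkingThreshold(4096);  // bytes buffered before switching to chunked mode
        conduit.setClient(policy);
    }

Turning setAllowChunking off forces the conduit to buffer the full request and send a Content-Length header instead, which is exactly the in-memory buffering described above.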
And as this chunking is about transmission only, the size of the entire message stays the same. CXF uses file-based caching for large messages, so the message is usually never loaded entirely into memory. However, some processing code (notably code that uses DOM-based XML parsing) will load the message entirely into memory and, even worse, consume a couple of times more memory than the original message size. That is your current memory problem. The newer WS-Security code in CXF 3.0.x does not use DOM, but the one in the older 2.7.x/2.6.x versions does and is affected by this memory consumption problem.

If your application can work on fragmented/segmented data (or "chunked", as you call it), you can use that application-level chunking so that the entire data never needs to be processed at once. A rough sketch of what such a pair of operations could look like is below, after your quoted message.

regards,
aki

2014-03-12 23:57 GMT+01:00 Hart, Andrew B. <[email protected]>:
> All,
>
> Also, I'd like to explain the problems we've been having and have my
> understanding of "chunking" clarified.
>
> Now, we have some web service operations which have the potential to fetch A
> LOT of data. Some of these have been written to be "asynchronous" and/or
> "chunked". But, I'm not utilizing CXF features to accomplish this; I mean
> that we have written two endpoints, one to request data where information is
> extracted and saved off in the database with a reference / correlation id.
> Then, a second endpoint is used to fetch the data using the correlation id.
>
> In the case of the "chunked" web services, the client also supplies a chunk
> number, and so we fetch the desired subset of the rows and send them back.
>
> Now, in the case of the chunked encoding that CXF provides, that is at the
> transport level, and I understand that it is the default for responses over
> 4K. So, I can see responses coming back with...
>
> Wed Mar 12 17:28:57 CDT 2014:DEBUG:<< "Transfer-Encoding: chunked[\r][\n]"
>
> ...and I see it picking up the chunks until a zero-length chunk is sent, at
> which time it is then GZipped, run through WS-Security processing and the
> response displayed.
>
> So, essentially, the transport-level chunking allows us to send larger
> responses back without the client losing the socket and/or timing out.
> However, if we don't chunk at the *application* level, that means that, on
> both the server and the client, we have to construct and handle the larger
> response. Most of my resource limitations have been encountered when
> encrypting or decrypting very large responses, so transport-level chunking
> doesn't help with that.
>
> Is my understanding correct? Is chunking at the application/service
> implementation level the best, or most standard, way of dealing with
> problems like this?
>
> I'm completely skipping over my lack of understanding of creating
> asynchronous web services. I understand the part about returning a Future
> object to the client, and that the client uses it to get the results, but I
> could never wrap my head around how it is implemented on the server side:
> where and how does the server persist the data until the client picks it up,
> etc.
>
> Regards,
>
> Andrew
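For illustration, the application-level chunking pattern you describe (stage the result under a correlation id, then pull it back one bounded slice at a time) could be captured in a service contract roughly like this. This is only a sketch; the interface and operation names are made up and are not any CXF API:

    import java.util.List;
    import javax.jws.WebService;

    @WebService
    public interface LargeResultService {

        // Runs the query, stores the rows (e.g. in the database, as you
        // already do) and hands back a correlation id for later retrieval.
        String requestData(String criteria);

        // Returns one bounded slice of the staged rows.  The client keeps
        // incrementing chunkNumber until an empty list comes back.
        List<String> fetchChunk(String correlationId, int chunkNumber, int chunkSize);
    }

Because each fetchChunk response is bounded, the DOM-based WS-Security processing on either side only ever holds one slice in memory, which is what the transport-level chunking alone cannot give you.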
