Thanks for the response Daniel,

After doing some more testing, it seems that data gets streamed back to the client even without chunking. There is indeed a buffer, but it appeared to be around the 24k mark, and filling it triggered a flush. This behaviour is good and is what we need.

However, your comment that the CacheAndWriteOutputStream is not meant to be there is a bit concerning :) We are currently using 2.2.2 and will likely upgrade to the latest release soon. Could this be the cause? What is the expected output stream?
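For illustration, the "nothing is sent until the buffer fills or a flush happens" behaviour can be reproduced with the JDK's own java.io.BufferedOutputStream. This is only a sketch of the general principle, not Jetty's or CXF's actual stream classes: the 24K buffer size simply mirrors the rough figure observed in testing, and CountingSink is a hypothetical stand-in for the socket.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

final class FlushDemo {

    /** Hypothetical stand-in for the wire: counts the bytes that actually arrive. */
    static final class CountingSink extends OutputStream {
        long received;

        @Override
        public void write(int b) {
            received++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            received += len;
        }
    }

    /** Writes the given number of zero-filled chunks through the stream. */
    static void writeChunks(OutputStream out, int chunks, int chunkSize) throws IOException {
        byte[] chunk = new byte[chunkSize];
        for (int i = 0; i < chunks; i++) {
            out.write(chunk, 0, chunk.length);
        }
    }

    public static void main(String[] args) throws IOException {
        CountingSink sink = new CountingSink();
        // Assumption: 24K buffer, chosen only to match the observed figure.
        OutputStream out = new BufferedOutputStream(sink, 24 * 1024);

        // Less data than the buffer holds: everything is still buffered.
        writeChunks(out, 10, 1000);
        System.out.println("sink after 10 chunks:  " + sink.received);

        // Enough data to overflow the buffer: the full buffer is pushed to the sink.
        writeChunks(out, 20, 1000);
        System.out.println("sink after 30 chunks:  " + sink.received);

        // An explicit flush pushes whatever is left.
        out.flush();
        System.out.println("sink after flush():    " + sink.received);
    }
}
```

The same two triggers (buffer full, explicit flush) are what decide when a client first sees data, so a smaller buffer or an earlier flush makes the response visible sooner at the cost of more, smaller writes.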
-----Original Message-----
From: Daniel Kulp [mailto:[email protected]]
Sent: Wednesday, 6 January 2010 2:26 AM
To: [email protected]
Cc: Mustafa Sezgin
Subject: Re: CXF and large XML request/responses : streaming support?

On Mon January 4 2010 7:38:09 pm Mustafa Sezgin wrote:
> Chunking could be a solution, as long as a write to the CacheAndWriteOutputStream (which in turn causes a write to the Http end point) causes a chunk to be sent.

For Jetty, a chunk is sent whenever one of:
1) flush() is called on the stream. This flushes the chunk out and will start a new chunk.
2) The buffer it holds fills. I think by default it uses a 16K buffer.

However, Jetty also needs to know that chunking is allowed. This is the tricky part and I really don't remember all the issues with it. I know if the request is a POST and comes in chunked, jetty will respond chunked. That works perfectly for our SOAP services based on posts. I don't know about GET requests. You MAY need to set an HTTP response header of Transfer-Encoding: chunked or similar to get jetty to chunk the response.

I think with Jetty, it also won't reply with chunked if "Connection: close" is on the request. It only chunks with keep-alives. I think.

I'm REALLY not sure at all about Tomcat. I've never looked into their code much.

BTW: any idea why the CacheAndWriteOutputStream is there? Is logging turned on? By default, that shouldn't be there and just wonder why it is. :-)

Dan

> I will look into configuring jetty to use chunking, you say this can be done via CXF, any documentation on this?
> Hopefully Dan or Eoghan can provide some answers :)
>
> Thanks
>
> -----Original Message-----
> From: Sergey Beryozkin [mailto:[email protected]]
> Sent: Monday, 4 January 2010 10:22 PM
> To: [email protected]
> Subject: RE: CXF and large XML request/responses : streaming support?
>
> Hi Mustafa
>
> Happy New Year to you too :-)
> Dan should be online today so I'm hoping he will clarify, I'm off today but will be online for a few days later this week.
> I'm just wondering is it something the underlying container can be configured to do, to stream back the data immediately after the CacheAndWriteOutputStream has been given some data through its httpresponse-connected stream ?
>
> Is it really the HTTP chunking that we are after here ? CXF can be used to configure jetty to do it and Tomcat should be configurable as well.
> Hope Dan or Eoghan can help here
>
> cheers, Sergey
>
> Hi Sergey,
>
> Happy new year and all :)
> I don't see how returning a StreamingOutput object will help in our instance. We do that at the moment for sending binary files back to the client however these files are already on the disk. Our problem mainly revolves around the fact that when we have a large number of objects in memory to marshall, the marshalling itself uses up more memory resulting in a constant barrage of major GC's. What we would like to do is essentially be able to stream back the marshalling as it occurs. So ideally, somehow configure JAXB so that as it performs marshalling of an object it sends the XML produced and then continues on with the next object to marshall. This would relieve some of the memory pressure currently being produced on our app servers.
>
> I think the ideal solution would be something where we can configure the CXF runtime to stream back responses as the marshalling is occurring rather than start the response streaming after all of the objects have been marshalled. This functionality would not be required for all methods, only some specific ones, so returning the to-be marshalled object in a wrapper object could possibly also help us in applying this functionality to only a subset of our service methods rather than all..
>
> I think another option may be to use some sort of outputstream which writes the marshalled XML to disk and then sends that back down the wire once the marshalling is complete thus not putting any pressure on memory.
>
> Now having said that, I have done some testing. It sort of turns out that CXF may already be doing what we want (thus the memory pressure may be caused by something in our app and not CXF/marshalling). To confirm this, hopefully you can answer a few questions Sergey.
>
> It sort of seems that when a request is being processed by the outbound interceptors, a CacheAndWriteOutputStream is used. This seems to have two output streams it writes to. One which is the http end point (AbstractHttpDestination.WrappedOutputstream) and the other which is an internal output stream initially being a memory based buffer but then being converted into a file output stream which gets created if the amount of data being written is over a certain threshold. This is good. I have verified that if a large amount of data is written the temp file is created and the rest of the generated xml is written there. My question remains though as to what happens when the write occurs on the http end point? As XML is being generated and CacheAndWriteOutputStream.write is called (which calls flowThroughStream.write with flowThroughStream being an instance of AbstractHttpDestination.WrappedOutputstream) does this actually get sent down the wire? It seems that I only get data visible on the client end (via a browser) when the MessageSenderEndingInterceptor closes the outputstream (CacheAndWriteOutputStream) which in turn does the flush and close on the AbstractHttpDestination.WrappedOutputstream & file output stream..
>
> Is this analysis correct Sergey? Or have I missed a vital bit of info somewhere? BTW this is all done with the enableStreaming = false so I have not registered my own JAXBElementProvider...
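The two-stream arrangement described above (one write goes straight through to the endpoint, a second copy is cached in memory and moved to a temp file past a threshold) can be sketched with plain JDK classes. This is not CXF's CacheAndWriteOutputStream code: the class name TeeSpillOutputStream, its fields, and the 1024-byte threshold used in main are all hypothetical and only illustrate the idea.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch: forwards every write and keeps a cached copy that spills to disk. */
final class TeeSpillOutputStream extends OutputStream {
    private final OutputStream flowThrough;   // e.g. the HTTP response stream
    private final long threshold;             // bytes kept in memory before spilling
    private ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private OutputStream cache = memory;
    private Path tempFile;
    private long cachedBytes;

    TeeSpillOutputStream(OutputStream flowThrough, long threshold) {
        this.flowThrough = flowThrough;
        this.threshold = threshold;
    }

    @Override
    public void write(int b) throws IOException {
        write(new byte[] {(byte) b}, 0, 1);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        // The endpoint sees the bytes first; whether they leave the machine
        // still depends on the endpoint's own buffering and flushing.
        flowThrough.write(b, off, len);
        if (tempFile == null && cachedBytes + len > threshold) {
            spillToDisk();
        }
        cache.write(b, off, len);
        cachedBytes += len;
    }

    /** Moves the in-memory copy to a temp file and caches there from now on. */
    private void spillToDisk() throws IOException {
        tempFile = Files.createTempFile("tee-cache", ".tmp");
        OutputStream file = new BufferedOutputStream(Files.newOutputStream(tempFile));
        memory.writeTo(file);
        memory = null;
        cache = file;
    }

    @Override
    public void flush() throws IOException {
        flowThrough.flush();
        cache.flush();
    }

    @Override
    public void close() throws IOException {
        flush();
        cache.close();
        flowThrough.close();
    }

    /** Returns the temp file, or null if the cache never left memory. */
    Path tempFile() {
        return tempFile;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        TeeSpillOutputStream out = new TeeSpillOutputStream(wire, 1024);
        out.write(new byte[500]);
        System.out.println("spilled after 500 bytes:  " + (out.tempFile() != null));
        out.write(new byte[1000]);
        System.out.println("spilled after 1500 bytes: " + (out.tempFile() != null));
        out.close();
        System.out.println("bytes on the wire: " + wire.size());
        Files.deleteIfExists(out.tempFile());
    }
}
```

In a sketch like this the cached copy only protects memory; it does nothing to make data reach the client earlier, which is why the flush behaviour of the flow-through stream is the part that matters for the question asked here.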
>
> Thanks
>
> Mustafa
>
>
>
> -----Original Message-----
> From: Sergey Beryozkin [mailto:[email protected]]
> Sent: Thursday, 24 December 2009 3:30 AM
> To: [email protected]
> Subject: Re: CXF and large XML request/responses : streaming support?
>
> Hi
>
> > I have the need to stream large XML responses back to the client using Jax-RS. We have a large number of objects (potentially up to a million) which need to be marshalled and the response returned, is there support for streaming XML responses while objects are being marshalled in CXF at the moment? We are currently seeing some large degradation in performance at times when these large number of entities are being marshalled.
> >
> > I basically return the objects which need to be marshalled from our Service methods, what would need to change for me to be able to make use of the streaming support?
>
> I can think of a few options. I do believe the CXF runtime has all that is needed to do the effective streaming back to the client but I will need to ask Dan for some clarifications, some updates might need to be applied to CXF JAXRS.
>
> As far as JAXRS itself is concerned, you might want to choose to return an instance of StreamingOutput from a method. Or JAXP Source and actually return an instance of CXF StaxSource.
> If it is JAXB that you use then you may want to try explicitly registering JAXBElementProvider and setting an "enableStreaming" boolean property on it in which case JAXBProvider will create an XMLStreamWriter and pass it to JAXB Marshaller. This option looks similar to explicitly returning an instance of StaxSource.
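As a rough illustration of what writing through an XMLStreamWriter buys, the sketch below emits one element per record straight to an OutputStream, so the full document never has to sit in memory. It uses only the JDK's StAX API (javax.xml.stream) and is not CXF's JAXBElementProvider or the JAXB Marshaller; the class name, the element names "items" and "item", and the flushEvery parameter are assumptions made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

/** Hypothetical sketch: writes records one at a time instead of building the whole document. */
final class IncrementalXmlWriter {

    /**
     * Writes `count` item elements to `out`, flushing the writer every
     * `flushEvery` items so data can move towards the client early.
     */
    static void writeItems(OutputStream out, int count, int flushEvery)
            throws XMLStreamException {
        XMLStreamWriter writer =
                XMLOutputFactory.newInstance().createXMLStreamWriter(out, "UTF-8");
        writer.writeStartDocument();
        writer.writeStartElement("items");
        for (int i = 0; i < count; i++) {
            // In a real service each record would be loaded or marshalled here,
            // written out, and then become eligible for garbage collection.
            writer.writeStartElement("item");
            writer.writeAttribute("id", Integer.toString(i));
            writer.writeCharacters("value-" + i);
            writer.writeEndElement();
            if (flushEvery > 0 && (i + 1) % flushEvery == 0) {
                writer.flush();
            }
        }
        writer.writeEndElement();
        writer.writeEndDocument();
        writer.flush();
        writer.close(); // closes the writer only, not the underlying stream
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeItems(out, 3, 2);
        System.out.println(out.toString("UTF-8"));
    }
}
```

With JAXB, the per-record step would typically be a marshal call against the same writer with fragment mode enabled, but that wiring is left out here because it depends on the JAXB version and classes in use.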
>
> Another option is to return a multipart formatted response, please see :
>
> http://cwiki.apache.org/CXF20DOC/jax-rs.html#JAX-RS-Writingattachments
>
> Another option which might be worth evaluating is to return a list of links back to a client (embedded in some minimal custom XML instance) so that a client can fetch data from different links in parallel which might improve the overall experience... Similar option is to update the interface for it to support the pagination...
>
> Let me know please what do you think is the best option for your project and then we can focus on ensuring that option is supported well by CXF JAXRS
>
> thanks, Sergey
>
> > -----Original Message-----
> > From: Sergey Beryozkin [mailto:[email protected]]
> > Sent: Friday, 9 October 2009 11:12 PM
> > To: [email protected]
> > Cc: rsmith
> > Subject: Re: CXF and large XML request/responses : streaming support?
> >
> > Hi
> >
> >> It is interesting, especially the Stax support. I'm not familiar with the recent build of CXF, on this matter would it be also available for the JAX-RS support.
> >
> > I missed it... I think in the case of JAXRS declaring a method accepting (JAXP) Source will work once I update a SourceProvider to check if XMLStreamReader is available on the message (or create a new one if it is a multipart request) and then wrap it in StaxSource and just pass it on - will be done for 2.3; if you need it working now then I can help you with creating a custom SourceProvider... The existing MultipartProvider will just delegate to it.
> >
> > thanks, Sergey
> >
> >> Anyway great framework :)
> >>
> >> On Thu, Oct 8, 2009 at 19:29, Daniel Kulp <[email protected]> wrote:
> >>> Right now, with a JAX-WS provider, there is SOME support for this, but it's far from ideal.
> >>> This is an area I'll be working in next week (resolving customer issues) and I'll see if I can add some enhancements easily enough.
> >>>
> >>> Basically, right now, if you do Provider<Source>, you would get DOMSource in (thus, the incoming message would not be streamed), but you could return a StreamSource or SAXSource or similar that we would use to copy stuff out. If you did Provider<StreamSource> or Provider<SAXSource>, we pull the full message into a Cached stream (which, for large messages, would output to temp files on disks) and return that to you. Thus, the whole thing isn't in memory, but it does result in the temp files and such.
> >>>
> >>> Part of what I hope to do next week is enable:
> >>> Provider<XMLStreamReader>
> >>> and/or
> >>> Provider<StaxSource>
> >>> which would allow full streaming in most cases.
> >>>
> >>> Dan
> >>>
> >>> On Wed October 7 2009 12:37:50 am rsmith wrote:
> >>> > I'm trying to find out if CXF supports full streaming of input and output messages for the SOAP transport.
> >>> >
> >>> > I have a service that will be receiving large input XML payload, and will be generating a response with a large XML payload. I can process the input XML incrementally, generating the response as the input is processed.
> >>> >
> >>> > Is there a way to implement a service in CXF streaming at all levels (XML parsing, data binding, generating response), avoiding holding the full document in memory at any time?
> >>> >
> >>> > I found several threads on the mailing list, some of which make it sound like it's supported.
> >>> > This message gave me the impression it may not currently be supported though:
> >>> > http://www.nabble.com/Re%3A-Configuring-streaming-web-services%3A-error-on-the-call-to-invoke-p24187339.html
> >>> >
> >>> > Some of the other threads:
> >>> > http://www.nabble.com/Looking-for-a-solution-for-Large-XML-Messages---streaming-and-JAXWS-td20451942.html#a20451942
> >>> > http://www.nabble.com/Recommended-way-to-have-a-web-method-stream-results-back-to-client--td22856243.html#a22864087
> >>> > http://www.nabble.com/SAXSource-td24411461.html#a24411461
> >>> >
> >>> > Thanks in advance
> >>>
> >>> --
> >>> Daniel Kulp
> >>> [email protected]
> >>> http://www.dankulp.com/blog
>

--
Daniel Kulp
[email protected]
http://www.dankulp.com/blog
