Chunking could be a solution, as long as a write to the CacheAndWriteOutputStream (which in turn causes a write to the HTTP endpoint) causes a chunk to be sent. I will look into configuring Jetty to use chunking. You say this can be done via CXF; is there any documentation on this? Hopefully Dan or Eoghan can provide some answers :)
Thanks

-----Original Message-----
From: Sergey Beryozkin [mailto:[email protected]]
Sent: Monday, 4 January 2010 10:22 PM
To: [email protected]
Subject: RE: CXF and large XML request/responses : streaming support?

Hi Mustafa

Happy New Year to you too :-)

Dan should be online today so I'm hoping he will clarify. I'm off today but will be online for a few days later this week.

I'm just wondering: is it something the underlying container can be configured to do, to stream back the data immediately after the CacheAndWriteOutputStream has been given some data through its HTTP-response-connected stream? Is it really HTTP chunking that we are after here? CXF can be used to configure Jetty to do it, and Tomcat should be configurable as well.

Hope Dan or Eoghan can help here.

cheers, Sergey

Hi Sergey,

Happy new year and all :)

I don't see how returning a StreamingOutput object will help in our instance. We do that at the moment for sending binary files back to the client; however, these files are already on the disk.

Our problem mainly revolves around the fact that when we have a large number of objects in memory to marshal, the marshalling itself uses up more memory, resulting in a constant barrage of major GCs. What we would like to do is essentially be able to stream back the marshalling as it occurs. So ideally, we would somehow configure JAXB so that as it marshals an object it sends the XML produced and then continues on with the next object to marshal. This would relieve some of the memory pressure currently being produced on our app servers.

I think the ideal solution would be something where we can configure the CXF runtime to stream back responses as the marshalling is occurring, rather than start the response streaming after all of the objects have been marshalled.
This functionality would not be required for all methods, only some specific ones, so returning the to-be-marshalled object in a wrapper object could possibly also help us apply this functionality to only a subset of our service methods rather than all of them.

I think another option may be to use some sort of output stream which writes the marshalled XML to disk and then sends that back down the wire once the marshalling is complete, thus not putting any pressure on memory.

Now, having said that, I have done some testing. It sort of turns out that CXF may already be doing what we want (thus the memory pressure may be caused by something in our app and not CXF/marshalling). To confirm this, hopefully you can answer a few questions, Sergey.

It seems that when a request is being processed by the outbound interceptors, a CacheAndWriteOutputStream is used. This seems to have two output streams it writes to. One is the HTTP endpoint (AbstractHttpDestination.WrappedOutputstream). The other is an internal output stream, which is initially a memory-based buffer but is converted into a file output stream if the amount of data being written goes over a certain threshold. This is good. I have verified that if a large amount of data is written, the temp file is created and the rest of the generated XML is written there.

My question remains, though, as to what happens when the write occurs on the HTTP endpoint. As XML is being generated and CacheAndWriteOutputStream.write is called (which calls flowThroughStream.write, with flowThroughStream being an instance of AbstractHttpDestination.WrappedOutputstream), does this actually get sent down the wire? It seems that I only get data visible on the client end (via a browser) when the MessageSenderEndingInterceptor closes the output stream (CacheAndWriteOutputStream), which in turn does the flush and close on the AbstractHttpDestination.WrappedOutputstream and the file output stream.
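
The two-stream behaviour described above can be sketched roughly as follows. This is a minimal illustration only, not CXF's actual CacheAndWriteOutputStream: the class name SpillingTeeOutputStream, its threshold parameter, and the isSpilled/getTempFile helpers are all hypothetical, and the spill logic is an assumption about how such a stream might work.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical illustration; NOT the CXF implementation.
class SpillingTeeOutputStream extends OutputStream {
    private final OutputStream flowThrough; // e.g. the stream connected to the HTTP response
    private final int threshold;            // bytes kept in memory before spilling to disk
    private ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private OutputStream cache = memory;    // current cache target (memory, later a file)
    private File tempFile;                  // non-null once the cache has spilled
    private long cached;                    // bytes written to the cache so far

    SpillingTeeOutputStream(OutputStream flowThrough, int threshold) {
        this.flowThrough = flowThrough;
        this.threshold = threshold;
    }

    @Override
    public void write(int b) throws IOException {
        write(new byte[] {(byte) b}, 0, 1);
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        // Hand the bytes to the wire-side stream first, then to the cache.
        flowThrough.write(buf, off, len);
        if (tempFile == null && cached + len > threshold) {
            spillToDisk();
        }
        cache.write(buf, off, len);
        cached += len;
    }

    // Move what is buffered in memory into a temp file and keep caching there.
    private void spillToDisk() throws IOException {
        tempFile = File.createTempFile("cache", ".tmp");
        tempFile.deleteOnExit();
        OutputStream file = new FileOutputStream(tempFile);
        memory.writeTo(file);
        memory = null;
        cache = file;
    }

    @Override
    public void flush() throws IOException {
        flowThrough.flush();
        cache.flush();
    }

    @Override
    public void close() throws IOException {
        flush();
        flowThrough.close();
        cache.close();
    }

    boolean isSpilled() { return tempFile != null; }

    File getTempFile() { return tempFile; }
}
```

Note that in a sketch like this, a call to write only hands bytes to the flow-through stream. Whether they are visible to the client before flush or close depends on how that stream, and anything behind it, buffers.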
Is this analysis correct, Sergey? Or have I missed a vital bit of info somewhere? BTW this is all done with enableStreaming = false, so I have not registered my own JAXBElementProvider...

Thanks
Mustafa

-----Original Message-----
From: Sergey Beryozkin [mailto:[email protected]]
Sent: Thursday, 24 December 2009 3:30 AM
To: [email protected]
Subject: Re: CXF and large XML request/responses : streaming support?

Hi

> I have the need to stream large XML responses back to the client using
> JAX-RS. We have a large number of objects (potentially up to a million)
> which need to be marshalled and the response returned. Is there support
> for streaming XML responses while objects are being marshalled in CXF at
> the moment? We are currently seeing some large degradation in performance
> at times when these large numbers of entities are being marshalled.
>
> I basically return the objects which need to be marshalled from our
> service methods. What would need to change for me to be able to make use
> of the streaming support?

I can think of a few options. I do believe the CXF runtime has all that is needed to do effective streaming back to the client, but I will need to ask Dan for some clarifications; some updates might need to be applied to CXF JAXRS.

As far as JAXRS itself is concerned, you might want to return an instance of StreamingOutput from a method. Or you could return a JAXP Source, actually returning an instance of CXF StaxSource.

If it is JAXB that you use, then you may want to try explicitly registering JAXBElementProvider and setting an "enableStreaming" boolean property on it, in which case JAXBProvider will create an XMLStreamWriter and pass it to the JAXB Marshaller. This option looks similar to explicitly returning an instance of StaxSource.
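
To make the XMLStreamWriter idea concrete, here is a minimal sketch of writing items to an output stream one at a time and flushing periodically, so the whole document never has to sit in memory. It uses only the JDK's StAX API. The class IncrementalXmlWriter, the element names, and the flushEvery parameter are hypothetical. In a JAX-RS resource, a loop like this could form the body of StreamingOutput.write(OutputStream), with each object marshalled as a JAXB fragment instead of written as plain text; that wiring is an assumption and is not shown here.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.Iterator;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper; not part of CXF or JAX-RS.
class IncrementalXmlWriter {

    // Writes <items><item>...</item>...</items>, pulling one item at a time
    // from the iterator and flushing after every flushEvery items.
    static void writeItems(Iterator<String> items, OutputStream out, int flushEvery)
            throws XMLStreamException, IOException {
        XMLStreamWriter xml =
                XMLOutputFactory.newInstance().createXMLStreamWriter(out, "UTF-8");
        xml.writeStartDocument("UTF-8", "1.0");
        xml.writeStartElement("items");
        int written = 0;
        while (items.hasNext()) {
            xml.writeStartElement("item");
            xml.writeCharacters(items.next());
            xml.writeEndElement();
            if (++written % flushEvery == 0) {
                xml.flush(); // push XML buffered in the writer to the stream
                out.flush(); // ask the underlying stream to pass it on
            }
        }
        xml.writeEndElement();
        xml.writeEndDocument();
        xml.flush();
        out.flush();
    }
}
```

Flushing here only asks the underlying stream to pass the data on. Whether the client sees it immediately still depends on the container's buffering and on whether the response is chunked.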
Another option is to return a multipart formatted response, please see:
http://cwiki.apache.org/CXF20DOC/jax-rs.html#JAX-RS-Writingattachments

Another option which might be worth evaluating is to return a list of links back to a client (embedded in some minimal custom XML instance) so that a client can fetch data from different links in parallel, which might improve the overall experience... A similar option is to update the interface so that it supports pagination...

Please let me know what you think is the best option for your project, and then we can focus on ensuring that option is supported well by CXF JAXRS.

thanks, Sergey

>
> -----Original Message-----
> From: Sergey Beryozkin [mailto:[email protected]]
> Sent: Friday, 9 October 2009 11:12 PM
> To: [email protected]
> Cc: rsmith
> Subject: Re: CXF and large XML request/responses : streaming support?
>
> Hi
>
>> It is interesting, especially the Stax support. I'm not familiar with the
>> recent build of CXF; on this matter, would it also be available for the
>> JAX-RS support?
>
> I missed it... I think in the case of JAXRS, declaring a method accepting
> a (JAXP) Source will work once I update a SourceProvider to check if an
> XMLStreamReader is available on the message (or create a new one if it is
> a multipart request) and then wrap it in StaxSource and just pass it on.
> This will be done for 2.3; if you need it working now then I can help you
> with creating a custom SourceProvider... The existing MultipartProvider
> will just delegate to it.
>
> thanks, Sergey
>
>>
>> Anyway great framework :)
>>
>>
>> On Thu, Oct 8, 2009 at 19:29, Daniel Kulp <[email protected]> wrote:
>>
>>>
>>> Right now, with a JAX-WS provider, there is SOME support for this, but
>>> it's far from ideal. This is an area I'll be working in next week
>>> (resolving customer issues) and I'll see if I can add some enhancements
>>> easily enough.
>>>
>>> Basically, right now, if you do Provider<Source>, you would get a
>>> DOMSource in (thus, the incoming message would not be streamed), but you
>>> could return a StreamSource or SAXSource or similar that we would use to
>>> copy stuff out. If you did Provider<StreamSource> or
>>> Provider<SAXSource>, we pull the full message into a Cached stream
>>> (which, for large messages, would output to temp files on disk) and
>>> return that to you. Thus, the whole thing isn't in memory, but it does
>>> result in the temp files and such.
>>>
>>> Part of what I hope to do next week is enable:
>>> Provider<XMLStreamReader>
>>> and/or
>>> Provider<StaxSource>
>>> which would allow full streaming in most cases.
>>>
>>> Dan
>>>
>>>
>>>
>>> On Wed October 7 2009 12:37:50 am rsmith wrote:
>>> > I'm trying to find out if CXF supports full streaming of input and
>>> > output messages for the SOAP transport.
>>> >
>>> > I have a service that will be receiving a large input XML payload, and
>>> > will be generating a response with a large XML payload. I can process
>>> > the input XML incrementally, generating the response as the input is
>>> > processed.
>>> >
>>> > Is there a way to implement a service in CXF streaming at all levels
>>> > (XML parsing, data binding, generating response), avoiding holding the
>>> > full document in memory at any time?
>>> >
>>> > I found several threads on the mailing list, some of which make it
>>> > sound like it's supported.
>>> > This message gave me the impression it may not currently be supported
>>> > though:
>>> >
>>> > http://www.nabble.com/Re%3A-Configuring-streaming-web-services%3A-error-on-the-call-to-invoke-p24187339.html
>>> >
>>> > Some of the other threads:
>>> >
>>> > http://www.nabble.com/Looking-for-a-solution-for-Large-XML-Messages---streaming-and-JAXWS-td20451942.html#a20451942
>>> >
>>> > http://www.nabble.com/Recommended-way-to-have-a-web-method-stream-results-back-to-client--td22856243.html#a22864087
>>> > http://www.nabble.com/SAXSource-td24411461.html#a24411461
>>> >
>>> > Thanks in advance
>>> >
>>>
>>> --
>>> Daniel Kulp
>>> [email protected]
>>> http://www.dankulp.com/blog
>>>
>>
>>
>>
>> --
>> Bryce
>>
>
