Exactly as expected... :-)

Andreas

On Wed, Mar 25, 2009 at 05:02, Kim Horn <[email protected]> wrote:
> Hello Andreas,
>
> This all works really well. Streaming uses hardly any memory.
> I also have a Java mediator streaming payloads, and even with massive
> files Synapse never uses much more than 40K.
>
> With just the OUT_ONLY property set, it uses much more memory, but the
> usage stabilises and does not grow.
> Thanks.
>
> -----Original Message-----
> From: Andreas Veithen [mailto:[email protected]]
> Sent: Friday, 20 March 2009 8:05 PM
> To: [email protected]
> Subject: Re: VFS - Synapse Memory Leak
>
> Of course the memory allocated to a message will be freed once the
> message has been processed. That is why it's important to set the
> OUT_ONLY property: if it is not set correctly, Synapse will keep the
> message context (with the payload) in a callback table to correlate it
> with a future response (which in your case never comes in). Probably
> there is something to improve here in Synapse:
> - The VFS transport should trigger an error if there is a mismatch
> between the message exchange pattern and the transport configuration
> of the service (the transport.vfs.* parameters).
> - Synapse should start issuing warnings when the number of entries in
> the callback table reaches a certain threshold.
>
> Andreas
>
> On Fri, Mar 20, 2009 at 01:41, Kim Horn <[email protected]> wrote:
>> Not really; I cannot see why memory should permanently grow when I
>> pass the same file repeatedly through VFS. In theory this means VFS
>> will always consume all the available memory given enough time and
>> file iterations. Therefore VFS cannot be used in a production system.
>> This is the definition of a memory leak. I would expect SOME overhead
>> on top of the file size, but I would assume the memory no longer
>> required would be reclaimed. I would also assume the overhead was not
>> 10 times the file size; that seems excessive.
>>
>> Yes, I understand the streaming approach should in theory use a fixed
>> and much smaller amount of memory, but I haven't tested that yet
>> either. Given the memory leak above, I see no reason why it would not
>> also grow permanently, just at a smaller rate.
>>
>> Thanks
>> Kim
>>
>> -----Original Message-----
>> From: Andreas Veithen [mailto:[email protected]]
>> Sent: Friday, 20 March 2009 10:52 AM
>> To: [email protected]
>> Subject: Re: VFS - Synapse Memory Leak
>>
>> If N is the size of the file, the memory consumption caused by the
>> transport is O(N) with transport.vfs.Streaming=false and O(1) with
>> transport.vfs.Streaming=true. The getTextAsStream and writeTextTo
>> methods in org.apache.axis2.format.ElementHelper are there to allow
>> you to implement your mediator with O(1) memory usage, so that the
>> overall memory consumption remains O(1). Does that answer your
>> question?
>>
>> Andreas
>>
>> On Thu, Mar 19, 2009 at 23:33, Kim Horn <[email protected]> wrote:
>>> It's the same Synapse.xml as specified originally and the same trace.
>>> If you are using Nabble you can see this; in case you lost the prior
>>> emails I can post them again.
>>>
>>> I must admit I did not set those extra parameters you mentioned, but
>>> I don't see why you should have to set a parameter to stop a memory
>>> leak. I guessed these parameters would just reduce the large amount
>>> of memory it appears to be using, e.g. 10 times the file size, via
>>> streaming? Why are there 10 copies of the data floating around? Lots
>>> of buffering. This issue suggests to me that any use of VFS will
>>> eventually kill the server. Even with smaller files it will
>>> eventually use all available memory. I guess I did not understand the
>>> actual reason for this issue from the prior discussion.
>>>
>>> I will try your extra parameters today though.
>>>
>>> Thanks
>>> Kim
>>>
>>>
>>> -----Original Message-----
>>> From: Andreas Veithen [mailto:[email protected]]
>>> Sent: Thursday, 19 March 2009 5:48 PM
>>> To: [email protected]
>>> Subject: Re: VFS - Synapse Memory Leak
>>>
>>> Kim,
>>>
>>> Can you post your current synapse.xml as well as the stack trace you
>>> get now?
>>>
>>> Andreas
>>>
>>> On Thu, Mar 19, 2009 at 07:20, kimhorn <[email protected]> wrote:
>>>>
>>>> Using the last stable build from 15 March 2009 I still get exactly
>>>> the same behaviour as originally described with the above script.
>>>> VFS still just dies. Would your fixes be in this?
>>>>
>>>> Andreas Veithen-2 wrote:
>>>>>
>>>>> I committed the code and it will be available in the next WS-Commons
>>>>> transport build. The methods are located in
>>>>> org.apache.axis2.format.ElementHelper in the axis2-transport-base
>>>>> module.
>>>>>
>>>>> Andreas
>>>>>
>>>>> On Thu, Mar 12, 2009 at 00:06, Kim Horn <[email protected]> wrote:
>>>>>> Hello Andreas,
>>>>>> This is great and really helps; I have not had time to try it out
>>>>>> but will soon.
>>>>>>
>>>>>> Contributing the java.io.Reader would be a great help, but it will
>>>>>> take me a while to get up to speed to do the Synapse iterator.
>>>>>>
>>>>>> In the short term I am going to use a brute force approach that is
>>>>>> now feasible given the memory issue is resolved. Just thought of
>>>>>> this one today. Use a VFS proxy to FTP the file locally, so
>>>>>> streaming helps here. A POJOCommand on <out> splits the file into
>>>>>> another directory, streaming in and out. Another independent VFS
>>>>>> proxy watches that directory and submits each file to the Web
>>>>>> service. Hopefully memory will be fine. Overloading the destination
>>>>>> may still be an issue?
>>>>>>
>>>>>> Kim
>>>>>>
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Andreas Veithen [mailto:[email protected]]
>>>>>> Sent: Monday, 9 March 2009 10:55 PM
>>>>>> To: [email protected]
>>>>>> Subject: Re: VFS - Synapse Memory Leak
>>>>>>
>>>>>> The changes I made in the VFS transport and the message builders for
>>>>>> text/plain and application/octet-stream certainly don't provide an
>>>>>> out-of-the-box solution for your use case, but they are the
>>>>>> prerequisite.
>>>>>>
>>>>>> Concerning your first proposed solution (let the VFS write the content
>>>>>> to a temporary file), I don't like this because it would create a
>>>>>> tight coupling between the VFS transport and the mediator. A design
>>>>>> goal should be that the solution will still work if the file comes
>>>>>> from another source, e.g. an attachment in an MTOM or SwA message.
>>>>>>
>>>>>> I think that an all-Synapse solution (2 or 3) should be possible, but
>>>>>> this will require development of a custom mediator. This mediator
>>>>>> would read the content, split it up (and store the chunks in memory or
>>>>>> on disk) and execute a sub-sequence for each chunk. The execution of
>>>>>> the sub-sequence would happen synchronously to limit the memory/disk
>>>>>> space consumption (to the maximum chunk size) and to avoid flooding
>>>>>> the destination service.
>>>>>>
>>>>>> Note that it is probably not possible to implement the mediator
>>>>>> using a script because of the problematic String handling. Also,
>>>>>> Spring, POJO and class mediators don't support sub-sequences (I
>>>>>> think). Therefore it should be implemented as a full-featured Java
>>>>>> mediator, probably taking the existing iterate mediator as a template.
>>>>>> I can contribute the required code to get the text content in the form
>>>>>> of a java.io.Reader.
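
The splitting step described above (read the text from a java.io.Reader, cut it into bounded chunks at record boundaries, process each chunk synchronously) can be sketched in plain Java. This is only an illustration under stated assumptions, not the Synapse mediator itself: the class name ChunkSplitter, the split method and the Consumer-based handler are hypothetical, and the handler stands in for "execute a sub-sequence for this chunk". It uses only the JDK (Java 8 or later) and no Synapse or Axis2 API.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.function.Consumer;

// Hypothetical helper, not part of Synapse or Axis2.
public class ChunkSplitter {

    /**
     * Reads line-oriented records from the reader and hands them to the
     * handler in chunks of at most maxChunkChars characters. The handler is
     * called synchronously, so only one chunk is held in memory at a time.
     * A single record longer than maxChunkChars is passed on as its own
     * (oversized) chunk rather than being cut in the middle.
     *
     * @return the number of chunks passed to the handler
     */
    public static int split(Reader in, int maxChunkChars, Consumer<String> handler)
            throws IOException {
        BufferedReader reader = new BufferedReader(in);
        StringBuilder chunk = new StringBuilder();
        int chunks = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            // +2 accounts for the CR LF that is re-appended after each record.
            if (chunk.length() > 0
                    && chunk.length() + line.length() + 2 > maxChunkChars) {
                handler.accept(chunk.toString());
                chunks++;
                chunk.setLength(0);
            }
            chunk.append(line).append("\r\n");
        }
        if (chunk.length() > 0) {
            // Flush the final, possibly smaller, chunk.
            handler.accept(chunk.toString());
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) throws IOException {
        Reader input = new StringReader("rec1\r\nrec2\r\nrec3\r\nrec4\r\n");
        int count = split(input, 12,
                c -> System.out.println("chunk of " + c.length() + " chars"));
        System.out.println(count + " chunks");
    }
}
```

Because the handler returns before the next chunk is built, memory use is bounded by the chunk size and the destination only ever sees one request at a time, which is the throttling behaviour discussed in this thread.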
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Andreas
>>>>>>
>>>>>> On Mon, Mar 9, 2009 at 03:05, kimhorn <[email protected]> wrote:
>>>>>>>
>>>>>>> Although this is a good feature it may not solve the actual problem?
>>>>>>> The first issue on my list was the memory leak.
>>>>>>> However, the real problem is that once I get these massive files I
>>>>>>> have to send them to a Web service that can only take them in small
>>>>>>> chunks (about 14MB). Streaming a file straight out would just kill
>>>>>>> the destination Web service; it would get the memory error. The text
>>>>>>> document can be split apart easily, as it has independent records on
>>>>>>> each line separated by <CR> <LF>.
>>>>>>>
>>>>>>> In an earlier post, which was not responded to, I mentioned:
>>>>>>>
>>>>>>> "Otherwise; for large EDI files a VFS iterator Mediator that streams
>>>>>>> through input file and outputs smaller chunks for processing, in
>>>>>>> Synapse, may be a solution ? "
>>>>>>>
>>>>>>> I had mentioned a few solutions in prior posts; the options now are:
>>>>>>>
>>>>>>> 1) VFS writes straight to a temporary file, then a Java mediator
>>>>>>> processes the file by splitting it into many smaller files. These
>>>>>>> files then trigger another VFS proxy that submits them to the final
>>>>>>> Web service. The problem is that it uses the file system (not so
>>>>>>> bad).
>>>>>>> 2) A Java mediator takes the <text> package and splits it up by
>>>>>>> wrapping it into many XML <data> elements that can then be acted on
>>>>>>> by a Synapse Iterator. So replace the text message with many smaller
>>>>>>> XML elements. The problem is that this loads the whole message into
>>>>>>> memory.
>>>>>>> 3) Create another Iterator in Synapse that works on a regular
>>>>>>> expression (to split the text data) or actually uses a for-loop
>>>>>>> approach to chop the file into chunks based on the loop index value.
>>>>>>> E.g. Index = 23 means a 14K chunk 23 chunks into the data.
>>>>>>> 4) Using the approach proposed now - just submit the file straight
>>>>>>> (stream it) to another Web service that chops it up. It may return
>>>>>>> an XML document with many sub-elements that allows the standard
>>>>>>> Iterator to work. Similar to (2) but using another service rather
>>>>>>> than Java to split the document.
>>>>>>> 5) Using the approach proposed now - just submit the file straight
>>>>>>> (stream it) to another Web service that chops it up but calls a
>>>>>>> Synapse proxy with each small packet of data, which then forwards it
>>>>>>> to the final Web service. So the Web service iterates across the
>>>>>>> data, and not Synapse.
>>>>>>>
>>>>>>> Then other solutions replace Synapse with a stand-alone Java program
>>>>>>> at the front end.
>>>>>>>
>>>>>>> Another issue here is throttling: splitting the file is one issue,
>>>>>>> but submitting hundreds of calls in parallel to the destination
>>>>>>> service would result in timeouts... So we need to work in throttling.
>>>>>>>
>>>>>>>
>>>>>>> Ruwan Linton wrote:
>>>>>>>>
>>>>>>>> I agree and can understand the time factor, and also +1 for reusing
>>>>>>>> stuff rather than trying to reinvent the wheel :-)
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Ruwan
>>>>>>>>
>>>>>>>> On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen
>>>>>>>> <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Ruwan,
>>>>>>>>>
>>>>>>>>> It's not a question of possibility, it is a question of available
>>>>>>>>> time :-)
>>>>>>>>>
>>>>>>>>> Also note that some of the features that we might want to implement
>>>>>>>>> have some similarities with what is done for attachments in Axiom
>>>>>>>>> (except that an attachment is only available once, while a file over
>>>>>>>>> VFS can be read several times). I think there is also some existing
>>>>>>>>> code in Axis2 that might be useful.
>>>>>>>>> We should not reimplement these
>>>>>>>>> things but try to make the existing code reusable. This however is
>>>>>>>>> only realistic for the next release after 1.3.
>>>>>>>>>
>>>>>>>>> Andreas
>>>>>>>>>
>>>>>>>>> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>> > Andreas,
>>>>>>>>> >
>>>>>>>>> > Can we have caching at the file system as a property, to support
>>>>>>>>> > multiple layers touching the full message, and is it possible to
>>>>>>>>> > specify a threshold for streaming? For example, if the message is
>>>>>>>>> > touched several times we might still need streaming, but not for
>>>>>>>>> > files of 100KB or less.
>>>>>>>>> >
>>>>>>>>> > Thanks,
>>>>>>>>> > Ruwan
>>>>>>>>> >
>>>>>>>>> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen
>>>>>>>>> > <[email protected]> wrote:
>>>>>>>>> >>
>>>>>>>>> >> I've done an initial implementation of this feature. It is
>>>>>>>>> >> available in trunk and should be included in the next nightly
>>>>>>>>> >> build. In order to enable this in your configuration, you need to
>>>>>>>>> >> add the following parameter to the proxy:
>>>>>>>>> >>
>>>>>>>>> >> <parameter name="transport.vfs.Streaming">true</parameter>
>>>>>>>>> >>
>>>>>>>>> >> You also need to add the following mediators just before the <send>
>>>>>>>>> >> mediator:
>>>>>>>>> >>
>>>>>>>>> >> <property action="remove" name="transportNonBlocking" scope="axis2"/>
>>>>>>>>> >> <property action="set" name="OUT_ONLY" value="true"/>
>>>>>>>>> >>
>>>>>>>>> >> With this configuration Synapse will stream the data directly from
>>>>>>>>> >> the incoming to the outgoing transport without storing it in memory
>>>>>>>>> >> or in a temporary file. Note that this has two other side effects:
>>>>>>>>> >> * The incoming file (or connection in case of a remote file) will
>>>>>>>>> >> only be opened on demand.
>>>>>>>>> >> In this case this happens during execution of the
>>>>>>>>> >> <send> mediator.
>>>>>>>>> >> * If during the mediation the content of the file is needed several
>>>>>>>>> >> times (which is not the case in your example), it will be read
>>>>>>>>> >> several times. The reason is of course that the content is not
>>>>>>>>> >> cached.
>>>>>>>>> >>
>>>>>>>>> >> I tested the solution with a 2GB file and it worked fine. The
>>>>>>>>> >> performance of the implementation is not yet optimal, but at least
>>>>>>>>> >> the memory consumption is constant.
>>>>>>>>> >>
>>>>>>>>> >> Some additional comments:
>>>>>>>>> >> * The transport.vfs.Streaming property has no impact on XML and
>>>>>>>>> >> SOAP processing: this type of content is processed exactly as
>>>>>>>>> >> before.
>>>>>>>>> >> * With the changes described here, we now have two different
>>>>>>>>> >> policies for plain text and binary content processing: in-memory
>>>>>>>>> >> caching + no streaming (transport.vfs.Streaming=false) and no
>>>>>>>>> >> caching + deferred connection + streaming
>>>>>>>>> >> (transport.vfs.Streaming=true). Probably we should define a wider
>>>>>>>>> >> range of policies in the future, including file system caching +
>>>>>>>>> >> streaming.
>>>>>>>>> >> * It is necessary to remove the transportNonBlocking property
>>>>>>>>> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send>
>>>>>>>>> >> mediator (more precisely the OperationClient) from executing the
>>>>>>>>> >> outgoing transport in a separate thread. This property is set by
>>>>>>>>> >> the incoming transport. I think this is a bug since I don't see any
>>>>>>>>> >> valid reason why the transport that handles the incoming request
>>>>>>>>> >> should determine the threading behavior of the transport that sends
>>>>>>>>> >> the outgoing request to the target service. Maybe Asankha can
>>>>>>>>> >> comment on this?
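
The difference between the two policies (in-memory caching versus streaming) can be illustrated outside Synapse with a plain JDK copy loop. This is a generic sketch, not the VFS transport's actual code: the class name StreamCopy and its copy method are hypothetical. The point is only that a fixed-size buffer keeps memory use constant regardless of payload size, whereas collecting the whole payload into a String or StringWriter grows with the file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical illustration, not part of Synapse or the VFS transport.
public class StreamCopy {

    /**
     * Copies everything from in to out through a buffer of bufferSize bytes.
     * Memory use is bounded by bufferSize, independent of the payload size.
     *
     * @return the total number of bytes copied
     */
    public static long copy(InputStream in, OutputStream out, int bufferSize)
            throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            // Forward exactly the bytes that were read in this pass.
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[1_000_000];
        // In-memory streams are used here only to keep the demo self-contained.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(payload), sink, 4096);
        System.out.println("copied " + copied + " bytes with a 4096-byte buffer");
    }
}
```

In a real deployment the input would be the file (or remote connection) and the output the outgoing transport, so nothing larger than the buffer is ever held by the copying code.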
>>>>>>>>> >>
>>>>>>>>> >> Andreas
>>>>>>>>> >>
>>>>>>>>> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <[email protected]> wrote:
>>>>>>>>> >> >
>>>>>>>>> >> > That's good, as this stops us from using Synapse.
>>>>>>>>> >> >
>>>>>>>>> >> >
>>>>>>>>> >> > Asankha C. Perera wrote:
>>>>>>>>> >> >>
>>>>>>>>> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError: Java heap space
>>>>>>>>> >> >>> at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>>>>>>>>> >> >>> at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>>>>>>>>> >> >>> at java.lang.StringBuffer.append(StringBuffer.java:307)
>>>>>>>>> >> >>> at java.io.StringWriter.write(StringWriter.java:72)
>>>>>>>>> >> >>> at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>>>>>>>>> >> >>> at org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>>>>>>>>> >> >>> at org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>>>>>>>>> >> >>> at org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>>>>>>>>> >> >>> at org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>>>>>>>>> >> >>>
>>>>>>>>> >> >> Since the content type is text, the plain text builder is trying
>>>>>>>>> >> >> to read the whole content into a String, as far as I can see,
>>>>>>>>> >> >> which is a problem for large content.
>>>>>>>>> >> >>
>>>>>>>>> >> >> A definite bug we need to fix.
>>>>>>>>> >> >>
>>>>>>>>> >> >> cheers
>>>>>>>>> >> >> asankha
>>>>>>>>> >> >>
>>>>>>>>> >> >> --
>>>>>>>>> >> >> Asankha C.
Perera
>>>>>>>>> >> >> AdroitLogic, http://adroitlogic.org
>>>>>>>>> >> >>
>>>>>>>>> >> >> http://esbmagic.blogspot.com
>>>>>>>>> >> >>
>>>>>>>>> >> >> ---------------------------------------------------------------------
>>>>>>>>> >> >> To unsubscribe, e-mail: [email protected]
>>>>>>>>> >> >> For additional commands, e-mail: [email protected]
>>>>>>>>> >> >>
>>>>>>>>> >> >
>>>>>>>>> >> > --
>>>>>>>>> >> > View this message in context:
>>>>>>>>> >> > http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
>>>>>>>>> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>>>>> >> >
>>>>>>>>> >
>>>>>>>>> > --
>>>>>>>>> > Ruwan Linton
>>>>>>>>> > http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>>>>> > http://ruwansblog.blogspot.com/
>>>>>>>>> >
>>>>>>>>
>>>>>>>> --
>>>>>>>> Ruwan Linton
>>>>>>>> http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>>>> http://ruwansblog.blogspot.com/
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> View this message in context:
>>>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22405973.html
>>>>>>> Sent from
the Synapse - Dev mailing list archive at Nabble.com.
