Kai, Thanks for the sample object and detailed analysis. I agree that not keeping the entire DigitalObject in memory would help a whole lot here, but would involve some pretty significant changes to code. Without doing that, I still think we can shave off significant amounts of required heap memory when doing modifications of objects. I've started a branch to begin working on this.
Just curious, is VERSIONABLE="false" an option for you with RELS-EXT? If so, that would be a big improvement in your case because otherwise, all the old versions of RELS-EXT are read into memory during any request involving the object.

Thanks,
Chris

On Tue, Aug 19, 2008 at 10:37 AM, Strnad, Kai <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> Thanks a lot for your ideas and suggestions. Separating the audit trail from the digital object is certainly helpful in reducing the overall size of the DO. There are cases, though, where the audit trail is only a minor part of the object, so removing it may not have the desired impact.
>
> I've attached an example object from our test suite with which we can consistently reproduce the OutOfMemoryError. The object is not unusual in terms of size or number of datastreams, so it should be a realistic sample (there are of course much bigger objects...). Attached you will also find a screenshot of the heap dump at the time the error occurred. The dump was analyzed with the Eclipse Memory Analyzer. To illustrate the problem and quickly provoke an error, I set the heap size to 64m.
>
> I've also attached the stack trace from the fedora.log. The trace shows that the OutOfMemoryError occurs at DOTranslationUtility.writeToStream:808. The thread doesn't stop there, however, because the error gets caught by a catch(Throwable t) clause, which catches exceptions as well as errors, and then proceeds with normal execution.
>
> As already stated in my previous mail, looking at the heap dump screenshot I see the following problems:
> * The StringBuffer the digital object is kept in is 4 times the size of the digital object in the worst case.
> * Indentation takes up lots of space; it would be helpful to make the serializer (and consequently the deserializer) customizable.
> * Keeping several copies of the entire digital object in memory when it is not needed puts additional strain on the heap.
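The StringBuffer overhead described in the list above can be observed with a few lines of plain Java. This is only a minimal sketch and not Fedora code: the class name `BufferCapacityDemo` and the helpers `fill` and `fillPresized` are hypothetical, and the exact capacities printed depend on the growth policy of the JDK in use.

```java
public class BufferCapacityDemo {

    /** Appends chunk count times to a buffer created with the default capacity. */
    static StringBuffer fill(int count, String chunk) {
        StringBuffer buf = new StringBuffer(); // grows on demand while appending
        for (int i = 0; i < count; i++) {
            buf.append(chunk);
        }
        return buf;
    }

    /** Same content, but the buffer is allocated once with the final size. */
    static StringBuffer fillPresized(int count, String chunk) {
        StringBuffer buf = new StringBuffer(count * chunk.length());
        for (int i = 0; i < count; i++) {
            buf.append(chunk);
        }
        return buf;
    }

    public static void main(String[] args) {
        String chunk = "<foxml:datastream/>\n"; // stand-in for serialized XML
        StringBuffer grown = fill(100_000, chunk);
        System.out.println("grown:    length=" + grown.length()
                + " capacity=" + grown.capacity());

        // Ask the buffer to release the unused tail before it is written out.
        grown.trimToSize();
        System.out.println("trimmed:  capacity=" + grown.capacity());

        StringBuffer sized = fillPresized(100_000, chunk);
        System.out.println("presized: length=" + sized.length()
                + " capacity=" + sized.capacity());
    }
}
```

The gap between `length()` and `capacity()` in the first line of output is memory that is allocated but unused. Trimming or presizing removes that gap, but the buffer still holds the whole object, which is why both count as quick fixes only.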
>
> I think there are two possible quick fixes.
> * Increase heap space accordingly so the peak never gets critical. This is, however, problematic due to the huge objects we are sometimes dealing with. Unless we use a very large heap, this only delays the error.
> * Trim the StringBuffer before writing the digital object to the stream and/or preallocate capacities on initialization. This would significantly reduce the size of the digital object in memory, but it will not solve the underlying issue; it only delays it.
>
> To solve the issue permanently, we should avoid keeping the whole digital object around when it is not needed. Being able to control the XML indentation would also help.
>
> - Kai
>
> -----Original Message-----
> From: Razum, Matthias
> Sent: Wednesday, August 13, 2008 17:50
> To: 'Daniel Davis'
> Cc: Chris Wilper; [email protected]; Strnad, Kai
> Subject: RE: [Fedora-commons-developers] Fedora OutOfMemoryErrors
>
> Dan,
>
> Don't get me wrong. I'm happy to see so many people looking into the issue, and any idea is worth discussing :-)
>
> We'll provide you with an example FOXML asap. Meanwhile, we will do some further profiling on our side as well. Kai has an idea for a very simple fix, but he needs to prove that the fix works in general and not just for his test case.
>
> Matthias.
>
>> -----Original Message-----
>> From: Daniel Davis [mailto:[EMAIL PROTECTED]
>> Sent: Wednesday, August 13, 2008 5:34 PM
>> To: Razum, Matthias
>> Cc: Chris Wilper; [email protected]; Strnad, Kai
>> Subject: Re: [Fedora-commons-developers] Fedora OutOfMemoryErrors
>>
>> We want a record of events close to the digital object too! There may be reasons to ALSO write events to the log, but for right now Chris is just trying to find out what is happening before suggesting how to fix it. It would be helpful for you to send us an example FOXML object that provokes the problem.
>>
>> -- Dan
>>
>> Razum, Matthias wrote:
>>
>> Dan and Chris,
>>
>> My two cents: My first reaction to Chris' proposal to separate the audit trail from the DO was disbelief. I always thought that one of the striking features of Fedora and FOXML is keeping stuff that belongs together in one XML structure that can be validated any time. When asked why someone should use Fedora, this is one of my top arguments. NARA-RLG has far more expertise and experience than I have, so I should probably dump my arguments and think of some new ones.
>>
>> Still, I would be concerned about long-term preservation of my DOs. If I start splitting up my DO (well, Fedora does that already with managed content, so it's not introducing anything new), preservation becomes even more challenging. With my very limited knowledge of PREMIS and the idea of tracking all changes to an object as events, isn't that exactly what the audit trail is good for? So would I want to keep it as an integral part of my object?
>>
>> Actually, for eSciDoc we can live perfectly well without the audit trail, as we write our own PREMIS-based event datastream for graphs of objects, so both changes combined would probably boost the number of versions before we run into out-of-memory errors.
>>
>> Matthias.
>>
>> -----Original Message-----
>> From: Daniel Davis [mailto:[EMAIL PROTECTED]
>> Sent: Tuesday, August 12, 2008 5:21 PM
>> To: Chris Wilper
>> Cc: Razum, Matthias; [email protected]; Strnad, Kai
>> Subject: Re: [Fedora-commons-developers] Fedora OutOfMemoryErrors
>>
>> The NARA-RLG report thinks that the "audit" should be kept separate from the "object" anyway because of the potential of tampering.
>> With correlation information kept in the log, this information could be kept in server/logs/audit.log which would be periodically snipped off and stored as a non-inlined Datastream in a sequence of repository generated objects that record change history.
>>
>> This would make it harder in the future to make digital object change operations idempotent because it is convenient to have that information localized to the digital object in question. Moving large audit trails to non-inlined Datastreams which are still encapsulated by the digital object would permit separate, though less convenient, processing.
>>
>> I am curious because the XML for fifty items should not be large enough to strain a reasonable memory model unless the traffic is very heavy. I have not looked at that code and I wonder if we can move to a delayed object creation scheme to reduce the size of the business objects representing the digital object in working memory. I know we are looking for a quick fix, not a refactoring, but I am still curious.
>>
>> -- Dan
>>
>> Chris Wilper wrote:
>>
>> Kai and Matthias,
>>
>> Just wanted to let you know I've been doing some profiling on this over here. I suspect saving the audit records external to the FOXML would help a LOT with this. One idea is to avoid the special "AUDIT" datastream altogether and save them in server/logs/audit.log instead. Later refactorings could address the issue of having to read the entire DigitalObject to make a change to one piece, but I think dealing with the ever-growing "AUDIT" datastream would be a simple way to stop the bleeding. Thoughts on this approach?
>>
>> - Chris
>>
>> 2008/7/29 Razum, Matthias <[EMAIL PROTECTED]>:
>>
>> Hi all,
>>
>> This is a pretty severe bug for us.
>> We run into the issue when we try to create a new version of an object with ~50 previous versions. This is a not-so-rare condition if we want to add members to a collection, thus creating versions of the collection object.
>>
>> I haven't seen any feedback for this bug report on the list from the Fedora dev team, and I can't find it in Fedora's bugtracker on sourceforge.net. Any reaction from the Fedora team would be highly appreciated, even though I am aware of the pressure from the upcoming Fedora 3.0 release.
>>
>> Cheers,
>> Matthias.
>>
>> -----Original Message-----
>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Strnad, Kai
>> Sent: Monday, July 14, 2008 11:47 AM
>> To: [email protected]
>> Subject: [Fedora-commons-developers] Fedora OutOfMemoryErrors
>>
>> Hi all,
>>
>> we frequently encounter OutOfMemoryErrors when calling modifyDatastreamByValue and other API-M methods on relatively large digital objects using Fedora Commons 3.0b1 and 3.0b2. In order to better understand the issue we triggered heap dumps and analyzed them. The dumps revealed that up to 140M of heap space get used by Fedora when calling modifyDatastreamByValue on a digital object of 15M.
>>
>> In order to provoke heap dumps at each API call the heap size was reduced. Additionally we triggered heap dumps at specific locations programmatically using the Java6 HotSpotDiagnosticMXBean.
>>
>> The OutOfMemoryError always occurs at DOTranslationUtility.writeToStream() after the serialization. This appears to be the peak of heap usage for modifyDatastreamByValue.
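Triggering a heap dump from code through the HotSpotDiagnosticMXBean, as mentioned above, might look roughly like the following. This is a sketch under assumptions, not the code used in the thread: the class name `HeapDumper` and the output file name are hypothetical, it only works on HotSpot-based JVMs, and recent JDKs additionally require the file name to end in `.hprof` and refuse to overwrite an existing file.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.io.IOException;
import java.lang.management.ManagementFactory;

public class HeapDumper {

    // Well-known JMX object name of the HotSpot diagnostic bean.
    private static final String HOTSPOT_BEAN_NAME =
            "com.sun.management:type=HotSpotDiagnostic";

    /**
     * Writes a heap dump of the current JVM to outputFile.
     * With liveOnly set, only objects that are still reachable are dumped.
     */
    static void dumpHeap(String outputFile, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                HOTSPOT_BEAN_NAME,
                HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(outputFile, liveOnly);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical target; a unique name avoids the "file exists" failure.
        File target = new File(System.getProperty("java.io.tmpdir"),
                "heap-" + System.currentTimeMillis() + ".hprof");
        dumpHeap(target.getPath(), true);
        System.out.println("wrote " + target + " (" + target.length() + " bytes)");
    }
}
```

A call to `HeapDumper.dumpHeap(...)` placed just before the code under suspicion yields a dump that can then be opened in a tool such as the Eclipse Memory Analyzer.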
>> The heap dump shows the following composition of objects at the time of writeToStream() (see attached screenshot):
>> * StringBuffer (60M): 15M * 2 (internal UTF-16 representation) + 30M memory allocated by StringBuffer (StringBuffer doubles its capacity automatically when insufficient capacity is left for appending a new String. Hence the capacity is likely to exceed the actual memory needed unless explicitly allocated).
>> * char[] array at writeToStream (StringBuffer.toString()) (31M): 15M * 2 + overhead
>> * BasicDigitalObject (24M): 15M DatastreamXMLMetadata, 9M AuditRecord
>> * DOReaderCache (25M): 1 BasicDigitalObject in cache at the time
>> * Some other small objects
>>
>> If the heap space is already consumed to a large extent, allocating another chunk of memory may fail and subsequently trigger an OutOfMemoryError. Explicitly calling the garbage collector is not a viable option, because most of the objects involved are still bound locally to the thread, so they are still reachable.
>>
>> Increasing the heap will solve the issue temporarily. Depending on the size of the digital object, however, the problem may resurface: suppose the digital object is 30M, then according to our findings a heap space of 60M*2 StringBuffer + 60M char array + ~50M DO + ~50M cache = 280M would be needed for a single digital object (we haven't tried this, however).
>>
>> We modified the Fedora code and tried the following options:
>> * We removed the indentation in the FOXMLDOSerializer and DOTranslationUtility. Removing most of the nonessential whitespace (or replacing indentation whitespace with tabs) results in a much smaller DO size (about 20% in our test case) and therefore reduces memory footprint.
>>
>> * As for the StringBuffer problem we basically tried two approaches. We trimmed the StringBuffer in FOXMLDOSerializer before the call to writeToStream() using the trimToSize() method. This adjusts the capacity of the StringBuffer to the actual number of characters contained within. Another option is to explicitly size the buffer.
>>
>> * The 64 bit version of Java consumes considerably more heap space compared to the 32 bit version. Using a 32 bit version reduces memory usage.
>>
>> All options mentioned above work well and reduce memory consumption significantly, but solve the underlying problem only partially.
>>
>> Perhaps a better solution would be to load and process only those parts of the digital object needed for the current operation (not viable for ingest, but e.g. modifyDatastreamByX), but that would probably involve lots of refactoring...
>>
>> Has anyone had to deal with this issue previously? Any insights or suggestions would be great.
>>
>> Thank you very much,
>> Kai
>>
>> ________________________________
>>
>> -------------------------------------------------------
>>
>> Fachinformationszentrum Karlsruhe, Gesellschaft für wissenschaftlich-technische Information mbH.
>> Registered office: Eggenstein-Leopoldshafen, Amtsgericht Mannheim HRB 101892.
>> Managing Director: Sabine Brünger-Weilandt.
>> Chairman of the Supervisory Board: MinR Hermann Riehl.
>>
>> --
>> Daniel W. Davis
>> Chief Software Architect, Fedora Commons
>> Researcher, Cornell Information Science
>> http://www.fedora-commons.org
>> [EMAIL PROTECTED]
>> [EMAIL PROTECTED]
>> (607) 255-6090 (Office)
>>
>
> -------------------------------------------------------
>
> Fachinformationszentrum Karlsruhe, Gesellschaft für wissenschaftlich-technische Information mbH.
> Registered office: Eggenstein-Leopoldshafen, Amtsgericht Mannheim HRB 101892.
> Managing Director: Sabine Brünger-Weilandt.
> Chairman of the Supervisory Board: MinR Hermann Riehl.
>
_______________________________________________
Fedora-commons-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/fedora-commons-developers
