Re: [Fedora-commons-developers] Fedora OutOfMemoryErrors

Chris Wilper Tue, 19 Aug 2008 08:29:18 -0700

Kai,

Thanks for the sample object and detailed analysis.  I agree that not
keeping the entire DigitalObject in memory would help a whole lot
here, but would involve some pretty significant changes to code.
Without doing that, I still think we can shave off significant amounts
of required heap memory when doing modifications of objects.  I've
started a branch to begin working on this.


Just curious, is VERSIONABLE="false" an option for you with RELS-EXT?
If so, that would be a big improvement in your case because otherwise,
all the old versions of RELS-EXT are read into memory during any
request involving the object.

Thanks,
Chris

On Tue, Aug 19, 2008 at 10:37 AM, Strnad, Kai
<[EMAIL PROTECTED]> wrote:
> Hi all,
>
> thanks a lot for your ideas and suggestions. Separating the audit trail from 
> the digital object is certainly helpful in reducing the overall size of the 
> DO. There are cases though where the audit trail is only a minor part of the 
> object and therefore removing it may not have the desired impact.
>
> I've attached an example object from our test suite with which we can 
> consistently reproduce the OutOfMemoryError. The object is not unusual in 
> terms of size or number of datastreams, so it should be a realistic sample 
> (there are of course much bigger objects...). Attached you will also find a 
> screenshot of the heap dump at the time the error occurred. The dump was 
> analyzed with the Eclipse Memory Analyzer. To illustrate the problem and 
> provoke the error quickly, I set the heap size to 64m.
>
> Also, I've attached the stack trace from the fedora.log. The trace shows that 
> the OutOfMemoryError occurs at DOTranslationUtility.writeToStream:808. The 
> thread doesn't stop there, however, because the error gets caught by a 
> catch(Throwable t) clause, which catches errors as well as exceptions, and 
> execution then proceeds normally.
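
The catch(Throwable t) behaviour described above can be illustrated with a minimal Java sketch. It is not taken from the Fedora code base: the class, interface and method names are hypothetical. It only shows the general idea of catching Exception rather than Throwable, so that an OutOfMemoryError reaches the caller instead of being swallowed.

```java
// Hypothetical sketch; none of these names come from the Fedora sources.
class ErrorHandlingSketch {

    /** Stand-in for a serialization step that may fail. */
    interface Step {
        void run() throws Exception;
    }

    /**
     * Runs a step and reports ordinary failures, but lets JVM errors
     * such as OutOfMemoryError propagate to the caller.
     */
    static String runGuarded(Step step) {
        try {
            step.run();
            return "ok";
        } catch (Exception e) {
            // Application-level failure: report it and carry on.
            return "failed: " + e.getMessage();
        }
        // There is deliberately no catch (Throwable t) here,
        // so an Error is not swallowed by this method.
    }

    public static void main(String[] args) {
        System.out.println(runGuarded(() -> { }));
        System.out.println(runGuarded(() -> {
            throw new java.io.IOException("disk full");
        }));
        try {
            runGuarded(() -> {
                throw new OutOfMemoryError("simulated");
            });
        } catch (OutOfMemoryError e) {
            System.out.println("propagated: " + e.getMessage());
        }
    }
}
```

Whether the existing catch clause in Fedora can be narrowed this way depends on what the surrounding code relies on, so this is a starting point for discussion rather than a patch.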
>
> As already stated in my previous mail, looking at the heap dump screenshot I 
> see the following problems:
> * The StringBuffer the digital object is kept in is 4 times the size of the 
> digital object in the worst case.
> * Indentation takes up lots of space, it would be helpful to make the 
> serializer (and consequently the deserializer) customizable.
> * Keeping several copies of the entire digital object in memory when it is 
> not needed puts additional strain on the heap.
>
> I think there are two possible quick fixes.
> * Increase heap space accordingly so the peak never gets critical. This is 
> however problematic due to the huge objects we are sometimes dealing with. 
> Unless we use a very large heap this only delays the error.
> * Trim the StringBuffer before writing the digital object to the stream 
> and/or preallocate its capacity on initialization. This would significantly 
> reduce the size of the digital object in memory, but it would not solve the 
> underlying issue, only delay it again.
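
The second quick fix can be sketched in a few lines of Java. This is a minimal illustration using only java.lang.StringBuffer; the class name, the fill() helper and the value of n are made up for the example and do not correspond to the FOXMLDOSerializer code.

```java
// Hypothetical sketch of trimming versus presizing a StringBuffer.
class BufferSizing {

    /** Appends n copies of 'x' to the buffer, one character at a time. */
    static void fill(StringBuffer buf, int n) {
        for (int i = 0; i < n; i++) {
            buf.append('x');
        }
    }

    public static void main(String[] args) {
        int n = 100_000; // stand-in for the serialized object's length in chars

        // Default construction: the buffer grows on demand and may overshoot.
        StringBuffer grown = new StringBuffer();
        fill(grown, n);
        System.out.println("grown:    length=" + grown.length()
                + " capacity=" + grown.capacity());

        // Option 1: give back the unused tail before handing the buffer on.
        grown.trimToSize();
        System.out.println("trimmed:  length=" + grown.length()
                + " capacity=" + grown.capacity());

        // Option 2: size the buffer up front when the final length can be estimated.
        StringBuffer presized = new StringBuffer(n);
        fill(presized, n);
        System.out.println("presized: length=" + presized.length()
                + " capacity=" + presized.capacity());
    }
}
```

Note that trimToSize() has to allocate a right-sized copy of the backing array, so for a moment both arrays exist. Presizing avoids that extra copy, but only helps if the final length can be estimated reasonably well.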
>
> To solve the issue permanently, we should avoid keeping the whole digital 
> object in memory when it is not needed. Being able to control the XML 
> indentation would also help.
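
To make the indentation point concrete, here is a small Java sketch of a serializer whose indent unit and line separator are parameters. It is purely illustrative: the element names are invented and are not FOXML, and the code is unrelated to the FOXMLDOSerializer implementation.

```java
// Hypothetical sketch: a toy serializer with configurable whitespace.
class IndentSketch {

    /** Writes `count` nested elements using the given indent unit and line separator. */
    static String serialize(int count, String indentUnit, String newline) {
        StringBuilder out = new StringBuilder();
        out.append("<object>").append(newline);
        for (int i = 0; i < count; i++) {
            out.append(indentUnit).append("<datastream>").append(newline);
            out.append(indentUnit).append(indentUnit)
               .append("<version id=\"").append(i).append("\"/>").append(newline);
            out.append(indentUnit).append("</datastream>").append(newline);
        }
        out.append("</object>").append(newline);
        return out.toString();
    }

    public static void main(String[] args) {
        String pretty = serialize(1000, "    ", "\n"); // four-space indentation
        String compact = serialize(1000, "", "");      // no insignificant whitespace
        System.out.println("pretty:  " + pretty.length() + " chars");
        System.out.println("compact: " + compact.length() + " chars");
    }
}
```

How much is saved in practice depends on nesting depth and on how much of the object is markup rather than content, so the ratio from this toy example should not be read as a prediction for real objects.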
>
> - Kai
>
>
>
> -----Original Message-----
> From: Razum, Matthias
> Sent: Wednesday, 13 August 2008 17:50
> To: 'Daniel Davis'
> Cc: Chris Wilper; [email protected]; Strnad, Kai
> Subject: RE: [Fedora-commons-developers] Fedora OutOfMemoryErrors
>
> Dan,
>
> Don't get me wrong. I'm happy to see so many people looking into the issue, 
> and any idea is worthwhile discussing :-)
>
> We'll provide you with an example FOXML asap. Meanwhile, we will do some 
> further profiling on our side as well. Kai has an idea for a very simple fix, 
> but he needs to prove that the fix works in general and not just for his test 
> case.
>
> Matthias.
>
>> -----Original Message-----
>> From: Daniel Davis [mailto:[EMAIL PROTECTED]
>> Sent: Wednesday, August 13, 2008 5:34 PM
>> To: Razum, Matthias
>> Cc: Chris Wilper; [email protected]; Strnad, Kai
>> Subject: Re: [Fedora-commons-developers] Fedora OutOfMemoryErrors
>>
>> We want a record of events close to the digital object too!
>> There may be reasons to ALSO write events to the log but for
>> right now Chris is just trying to find out what is
>> happening---before suggesting how to fix it.  It would be
>> helpful for you to send us an example FOXML object that
>> provokes the problem.
>>
>> -- Dan
>>
>> Razum, Matthias wrote:
>>
>>       Dan and Chris,
>>
>>       My two cents: My first reaction to Chris' proposal to separate the audit
>>       trail from the DO was disbelief. I always thought that one of the
>>       striking features of Fedora and FOXML is keeping stuff that belongs
>>       together in one XML structure that can be validated any time. When asked
>>       why someone should use Fedora, this is one of my top arguments. NARA-RLG
>>       has far more expertise and experience than I have, so I should probably
>>       dump my arguments and think of some new ones.
>>
>>       Still, I would be concerned about long-term preservation of my DOs. If
>>       I start splitting up my DO (well, Fedora does that already with
>>       managed content, so it's not introducing anything new), preservation
>>       becomes even more challenging. With my very little knowledge about
>>       PREMIS and the idea to track all changes to an object as events, isn't
>>       that exactly what the audit trail is good for? So would I want to keep
>>       it as an integral part of my object?
>>
>>       Actually, for eSciDoc we can perfectly live without the audit trail, as
>>       we write our own PREMIS-based event datastream for graphs of objects, so
>>       both changes combined would probably boost the number of versions before
>>       we run into out-of-memory errors.
>>
>>       Matthias.
>>
>>
>>
>>
>>               -----Original Message-----
>>               From: Daniel Davis [mailto:[EMAIL PROTECTED]
>>               Sent: Tuesday, August 12, 2008 5:21 PM
>>               To: Chris Wilper
>>               Cc: Razum, Matthias; [email protected]; Strnad, Kai
>>               Subject: Re: [Fedora-commons-developers] Fedora OutOfMemoryErrors
>>
>>               The NARA-RLG report thinks that the "audit" should be kept
>>               separate from the "object" anyway because of the potential for
>>               tampering.  With correlation information kept in the log,
>>               this information could be kept in server/logs/audit.log which
>>               would be periodically snipped off and stored as a non-inlined
>>               Datastream in a sequence of repository generated objects that
>>               record change history.
>>
>>               This would make it harder in the future to make digital
>>               object change operations idempotent because it is convenient
>>               to have that information localized to the digital object in
>>               question.  Moving large audit trails to non-inlined
>>               Datastreams which are still encapsulated by the digital
>>               object would permit separate, though less convenient, processing.
>>
>>               I am curious because the XML for fifty items should not be
>>               too large for a reasonable memory model unless the traffic
>>               is very heavy.  I have not looked at that code and I wonder
>>               if we can move to a delayed object creation scheme to reduce
>>               the size of the business objects representing the digital
>>               object in working memory.  I know we are looking for a quick
>>               fix not a refactoring but I am still curious.
>>
>>               -- Dan
>>
>>               Chris Wilper wrote:
>>
>>                       Kai and Matthias,
>>
>>                       Just wanted to let you know I've been doing some profiling on this
>>                       over here.  I suspect saving the audit records external to the FOXML
>>                       would help a LOT with this.  One idea is to avoid the special "AUDIT"
>>                       datastream altogether and save them in server/logs/audit.log instead.
>>                       Later refactorings could address the issue of having to read the
>>                       entire DigitalObject to make a change to one piece, but I think
>>                       dealing with the ever-growing "AUDIT" datastream would be a simple way
>>                       to stop the bleeding.  Thoughts on this approach?
>>
>>                       - Chris
>>
>>                       2008/7/29 Razum, Matthias <[EMAIL PROTECTED]>:
>>
>>
>>                               Hi all,
>>
>>                               This is a pretty severe bug for us. We run into the issue when we try to
>>                               create a new version of an object with ~50 previous versions. This is a
>>                               not-so-rare condition if we want to add members to a collection, thus
>>                               creating versions of the collection object.
>>
>>                               I haven't seen any feedback for this bug report on the list from the
>>                               Fedora dev team, and I can't find it in Fedora's bugtracker on
>>                               sourceforge.net. Any reaction from the Fedora team would be highly
>>                               appreciated, even though I am aware of the pressure from the upcoming
>>                               Fedora 3.0 release.
>>
>>                               Cheers,
>>                               Matthias.
>>
>>
>>
>>
>>
>>                                       -----Original Message-----
>>                                       From: [EMAIL PROTECTED] On Behalf Of Strnad, Kai
>>                                       Sent: Monday, July 14, 2008 11:47 AM
>>                                       To: [email protected]
>>                                       Subject: [Fedora-commons-developers] Fedora OutOfMemoryErrors
>>
>>                                       Hi all,
>>
>>                                       we frequently encounter OutOfMemoryErrors when calling
>>                                       modifyDatastreamByValue and other API-M methods on relatively large
>>                                       digital objects using Fedora Commons 3.0b1 and 3.0b2. In order to better
>>                                       understand the issue we triggered heap dumps and analyzed them. The
>>                                       dumps revealed that up to 140M of heap space is used by Fedora when
>>                                       calling modifyDatastreamByValue on a digital object of 15M.
>>
>>                                       In order to provoke heap dumps at each API call the heap size was
>>                                       reduced. Additionally we triggered heap dumps at specific locations
>>                                       programmatically using the Java6 HotSpotDiagnosticMXBean.
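
For readers who want to reproduce the programmatic heap dumps mentioned above, the following Java sketch shows one way to call HotSpotDiagnosticMXBean. The class name, method name and file name are assumptions made for the example; the original mail does not show the code that was actually used.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; only the MXBean and its dumpHeap method are real JDK API.
class HeapDumpSketch {

    /** Writes a heap dump of the current JVM to the given file. */
    static void dumpHeap(Path target, boolean liveObjectsOnly) throws IOException {
        HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
        // Fails with an IOException if the target file already exists.
        bean.dumpHeap(target.toString(), liveObjectsOnly);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("heapdump-sketch");
        Path target = dir.resolve("before-writeToStream.hprof");
        dumpHeap(target, true);
        System.out.println("Heap dump written to " + target
                + " (" + Files.size(target) + " bytes)");
    }
}
```

This only works on HotSpot-based JVMs, and some JDK releases reject file names that do not end in .hprof. The resulting file can be opened with a heap analyzer such as the Eclipse Memory Analyzer.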
>>
>>                                       The OutOfMemoryError always occurs at
>>                                       DOTranslationUtility.writeToStream() after the serialization. This
>>                                       appears to be the peak of heap usage for modifyDatastreamByValue.
>>                                       The heap dump shows the following composition of objects at
>>                                       the time of writeToStream() (see attached screenshot):
>>                                        * StringBuffer (60M): 15M * 2 (internal UTF-16 representation)
>>                                          + 30M memory allocated by StringBuffer (StringBuffer doubles
>>                                          its capacity automatically when insufficient capacity is left
>>                                          for appending a new String. Hence the capacity is likely to
>>                                          exceed the actual memory needed unless explicitly allocated).
>>                                        * char[] array at writeToStream (StringBuffer.toString()) (31M)
>>                                          (15M * 2 + overhead)
>>                                        * BasicDigitalObject 24M (15M DatastreamXMLMetadata, 9M AuditRecord)
>>                                        * DOReaderCache 25M (1 BasicDigitalObject in cache at the time)
>>                                        * Some other small objects
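
The capacity doubling described in the StringBuffer item can be observed directly. The sketch below is a stand-alone Java illustration with a hypothetical class and method name; it records every capacity a default-constructed StringBuffer passes through while characters are appended one at a time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch for observing StringBuffer growth.
class BufferGrowth {

    /** Appends n chars one at a time and records each distinct capacity seen. */
    static List<Integer> capacitySteps(int n) {
        StringBuffer buf = new StringBuffer();
        List<Integer> steps = new ArrayList<>();
        steps.add(buf.capacity());
        for (int i = 0; i < n; i++) {
            buf.append('x');
            int capacity = buf.capacity();
            if (capacity != steps.get(steps.size() - 1)) {
                steps.add(capacity); // the backing array was reallocated
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        List<Integer> steps = capacitySteps(n);
        int last = steps.get(steps.size() - 1);
        System.out.println("capacities: " + steps);
        System.out.println("unused slots at the end: " + (last - n));
    }
}
```

The figures in the mail refer to a Java 6 era JVM, where every character occupies two bytes. Since Java 9, compact strings can store Latin-1-only content in one byte per character, so the absolute numbers may be lower on newer JVMs while the growth pattern stays the same.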
>>
>>                                       If the heap space is already consumed to a large extent, allocating
>>                                       another chunk of memory may fail and subsequently trigger an
>>                                       OutOfMemoryError. Explicitly calling the garbage collector is not a
>>                                       viable option, because most of the objects involved are still bound
>>                                       locally to the thread, so they are still reachable.
>>
>>                                       Increasing the heap will solve the issue temporarily. Depending on the
>>                                       size of the digital object the problem may however resurface: Suppose
>>                                       the digital object is 30M, then according to our findings a heap space
>>                                       of 60M*2 StringBuffer + 60M char array + ~50M DO + ~50M cache = 280M
>>                                       would be needed for a single digital object (we haven't tried this
>>                                       however).
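
The projection above can be written down as a small calculation. The Java sketch below simply scales the four components measured for the 15M object; the class and method names are hypothetical, and the scaling factors are a rough reading of the heap dump figures rather than a model of Fedora's memory behaviour.

```java
// Hypothetical back-of-the-envelope estimate based on the 15M measurement.
class PeakHeapEstimate {

    /** Rough peak heap in MB for one modify call on an object of the given size in MB. */
    static double estimateMb(double objectMb) {
        // UTF-16 chars (x2), with capacity up to twice the content (x2).
        double stringBuffer = objectMb * 2 * 2;
        // Copy created by StringBuffer.toString(), also UTF-16.
        double charArray = objectMb * 2;
        // BasicDigitalObject: 24M observed for a 15M object.
        double digitalObject = objectMb * 24.0 / 15.0;
        // DOReaderCache: 25M observed for a 15M object.
        double readerCache = objectMb * 25.0 / 15.0;
        return stringBuffer + charArray + digitalObject + readerCache;
    }

    public static void main(String[] args) {
        System.out.printf("15M object: ~%.0f MB%n", estimateMb(15));
        System.out.printf("30M object: ~%.0f MB%n", estimateMb(30));
    }
}
```

This gives roughly 139 MB for the 15M object and 278 MB for a 30M object, in line with the 140M and 280M figures quoted in the mail once array overhead and rounding are taken into account.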
>>
>>                                       We modified the Fedora code and tried the following options:
>>                                       * We removed the indentation in the FOXMLDOSerializer and
>>                                       DOTranslationUtility. Removing most of the nonessential
>>                                       whitespace (or replacing indentation whitespace with tabs)
>>                                       results in a much smaller DO size (about 20% in our test case)
>>                                       and therefore reduces the memory footprint.
>>
>>                                       * As for the StringBuffer problem we basically tried two
>>                                       approaches. We trimmed the StringBuffer in FOXMLDOSerializer
>>                                       before the call to writeToStream() using the trimToSize()
>>                                       method. This adjusts the capacity of the StringBuffer to the
>>                                       actual number of characters contained within. Another option
>>                                       is to explicitly size the buffer.
>>
>>                                       * The 64 bit version of Java consumes considerably more heap space
>>                                       compared to the 32 bit version. Using a 32 bit version reduces
>>                                       memory usage.
>>
>>                                       All options mentioned above work well and reduce memory consumption
>>                                       significantly, but solve the underlying problem only partially.
>>
>>                                       Perhaps a better solution would be to load and process only
>>                                       those parts of the digital object needed for the current operation
>>                                       (not viable for ingest, but e.g. modifyDatastreamByX), but that
>>                                       would probably involve lots of refactoring...
>>
>>                                       Has anyone had to deal with this issue previously? Any insights or
>>                                       suggestions would be great.
>>
>>
>>                                       Thank you very much,
>>                                       Kai
>>
>>
>> ________________________________
>>
>>
>>
>>
>>               -------------------------------------------------------
>>
>>               Fachinformationszentrum Karlsruhe, Gesellschaft für wissenschaftlich-technische Information mbH.
>>               Registered office: Eggenstein-Leopoldshafen, Amtsgericht Mannheim HRB 101892.
>>               Managing Director: Sabine Brünger-Weilandt.
>>               Chairman of the Supervisory Board: MinR Hermann Riehl.
>>
>>
>>
>>
>> --
>> Daniel W. Davis
>> Chief Software Architect, Fedora Commons
>> Researcher, Cornell Information Science
>> http://www.fedora-commons.org
>> [EMAIL PROTECTED]
>> [EMAIL PROTECTED]
>> (607) 255-6090 (Office)
>>
>>
>