stevedlawrence opened a new pull request #489:
URL: https://github.com/apache/incubator-daffodil/pull/489


   The UStateForSuspensions and DirectOrBufferedDataOutputStream classes
   have members that effectively create linked lists. In each of these
   cases, we unknowingly hold onto the head of these linked lists, which
   prevents garbage collection of all UStateForSuspensions and
   DirectOrBufferedDataOutputStream instances. This means we essentially
   hold on to all unparse state, which quickly leads to out of memory
   errors for large formats that require many suspensions.
   
   - The first issue is the "prior" member of UStateMain/UStateForSuspensions.
     This member is set so that each UState points to the previous
     UStateForSuspensions that has been created, essentially creating a
     linked list of all UStateForSuspensions, with the head in UStateMain.
     This prevents all UStateForSuspensions from being garbage collected,
     as well as all the state they point to (it's a lot).
   
     Fortunately, this member isn't used anywhere anymore. Presumably it
     was once used for debugging suspensions, but is no longer used or
     needed. So we can simply remove this member so these
     UStateForSuspensions can be garbage collected once the Suspensions
     that use them are finished and garbage collected.
   
   - The second issue is related to the "following" member in
     DirectOrBufferedDataOutputStream. This member is used to keep track
     of the buffered DOS that follows this DOS (and iteratively, all
     following DOS's). As the direct DOS is finished, we make the following
     DOS direct and update pointers correctly. However, we create the very
     first direct DOS in the "unparse" function, which means it lives on
     the stack and cannot be garbage collected until unparse finishes. And
     because this DOS iteratively points to all following DOS's via the
     "following" member, it means we can never free any DOS's (and all the
     buffered data associated with those DOS's) until the end of unparse.
   
     The solution in this case is to not create the initial direct DOS in
     the unparse function on the stack, but instead to create it as part of
     the UState creation when we initialize the "dataOutputStream" var.
     This way there is no pointer to the initial DOS except for those held
     in UState or Suspensions. As the UState mutates or Suspensions
     resolve, we will completely lose references to earlier DOS's and they
     can be garbage collected.
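
   The retention pattern behind both issues can be sketched in a few
   lines of Java. This is only an illustration, not Daffodil code: the
   Dos and State classes, their fields, and the method names are
   hypothetical stand-ins, and reachability is counted by walking the
   links rather than by observing the real garbage collector. The same
   shape applies to the "prior" chain, with the links pointing backward
   instead of forward.

```java
public class RetentionSketch {
    /** Hypothetical stand-in for a buffered output stream with a "following" link. */
    static final class Dos {
        final int id;
        Dos following; // next stream in the chain, or null at the tail
        Dos(int id) { this.id = id; }
    }

    /** Hypothetical stand-in for unparser state: it tracks only the current direct stream. */
    static final class State {
        Dos current;
        State(Dos initial) { this.current = initial; }
        /** The direct stream is finished, so the following stream becomes direct. */
        void finishDirect() { current = current.following; }
    }

    /** Builds a chain of n streams (n >= 1) and returns its head. */
    static Dos buildChain(int n) {
        Dos head = new Dos(0);
        Dos tail = head;
        for (int i = 1; i < n; i++) {
            tail.following = new Dos(i);
            tail = tail.following;
        }
        return head;
    }

    /** Counts the streams reachable from root by walking "following" links. */
    static int reachableFrom(Dos root) {
        int count = 0;
        for (Dos d = root; d != null; d = d.following) count++;
        return count;
    }

    /** Leaky shape: a local variable keeps the head for the whole run (finished <= n). */
    public static int retainedWithPinnedHead(int n, int finished) {
        Dos head = buildChain(n);       // this local pins the entire chain
        State state = new State(head);
        for (int i = 0; i < finished; i++) state.finishDirect();
        return reachableFrom(head);     // still n: no stream can be collected
    }

    /** Fixed shape: only the state refers to the chain, so finished streams drop off. */
    public static int retainedWithoutPinnedHead(int n, int finished) {
        State state = new State(buildChain(n)); // no separate reference to the head
        for (int i = 0; i < finished; i++) state.finishDirect();
        return reachableFrom(state.current);    // n - finished
    }

    public static void main(String[] args) {
        System.out.println(retainedWithPinnedHead(1000, 900));    // prints 1000
        System.out.println(retainedWithoutPinnedHead(1000, 900)); // prints 100
    }
}
```

   With the head held in a local, all 1000 streams stay reachable even
   after 900 of them are finished. Once the state holds the only
   reference, just the 100 unfinished streams remain reachable and the
   rest become eligible for collection.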
   
   Fixing these two issues allows unparsing very large infosets that
   require buffering, without running into out of memory errors.
   
   DAFFODIL-2468


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

