stevedlawrence opened a new pull request #489:
URL: https://github.com/apache/incubator-daffodil/pull/489
The UStateForSuspensions and DirectOrBufferedDataOutputStream classes
have members that effectively create linked lists. In each of these
cases, we unknowingly hold onto the head of these linked lists, which
prevents garbage collection of all UStateForSuspensions and
DirectOrBufferedDataOutputStream instances. This means we essentially
hold on to all unparse state, which quickly leads to out of memory
errors for large formats that require many suspensions.
- The first issue is the "prior" member of UStateMain/UStateForSuspensions.
This member is set so that each UState points to the previous
UStateForSuspension that has been created, essentially creating a
linked list of all UStateForSuspensions, with the head in UStateMain.
This prevents all UStateForSuspensions from being garbage collected,
as well as all the state they point to (it's a lot).
Fortunately, this member isn't used anywhere anymore. Presumably it
was once used for debugging suspensions, but is no longer used or
needed. So we can simply remove this member so these
UStateForSuspensions can be garbage collected once the Suspensions
that use them are finished and garbage collected.
- The second issue is related to the "following" member in
DirectOrBufferedDataOutputStream. This member is used to keep track
of the buffered DOS that follows this DOS (and transitively, all
following DOS's). As the direct DOS is finished, we make the following
DOS direct and update pointers accordingly. However, we create the very
first direct DOS in the "unparse" function, which means it lives on
the stack and cannot be garbage collected until unparse finishes. And
because this DOS transitively points to all following DOS's via the
"following" member, it means we can never free any DOS's (and all the
buffered data associated with those DOS's) until the end of unparse.
The solution in this case is to not create the initial direct DOS in
the unparse function on the stack, but instead to create it as part of
the UState creation when we initialize the "dataOutputStream" var.
This way there is no pointer to the initial DOS except for those held
in UState or Suspensions. As the UState mutates or Suspensions
resolve, we will completely lose references to earlier DOS's and they
can be garbage collected.
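
The retention pattern behind the second issue can be illustrated with a minimal sketch. It is written in Java and uses hypothetical, heavily simplified stand-ins (`Dos`, `State`, `finishCurrent`, and the two `retained...` helpers are assumptions made for illustration, not Daffodil's actual classes or methods). Rather than relying on garbage collector timing, it counts how many streams remain reachable from whichever root is still held, since anything reachable cannot be collected.

```java
// Hypothetical stand-ins for illustration only; not Daffodil's real classes.
final class RetentionSketch {

    // Simplified stream: each one points at the stream that follows it.
    static final class Dos {
        Dos following;
    }

    // Simplified unparse state: holds only the current direct stream.
    static final class State {
        Dos dataOutputStream;

        State(Dos initial) {
            this.dataOutputStream = initial;
        }

        // The direct stream finishes, so the one that follows it becomes direct.
        void finishCurrent() {
            dataOutputStream = dataOutputStream.following;
        }
    }

    // Builds a chain of n streams linked through "following"; returns the first.
    static Dos chain(int n) {
        Dos first = new Dos();
        Dos cur = first;
        for (int i = 1; i < n; i++) {
            cur.following = new Dos();
            cur = cur.following;
        }
        return first;
    }

    // Counts the streams reachable from a root by walking "following".
    static int reachableFrom(Dos root) {
        int count = 0;
        for (Dos d = root; d != null; d = d.following) {
            count++;
        }
        return count;
    }

    // Before: a local variable keeps the first stream alive for the whole call,
    // so every stream in the chain stays reachable no matter how many finish.
    static int retainedWithLocalHead(int n, int finished) {
        Dos first = chain(n);
        State state = new State(first);
        for (int i = 0; i < finished; i++) {
            state.finishCurrent();
        }
        return reachableFrom(first);
    }

    // After: the first stream is created only as part of state creation,
    // so finished streams become unreachable as the state advances.
    static int retainedWithStateOnly(int n, int finished) {
        State state = new State(chain(n));
        for (int i = 0; i < finished; i++) {
            state.finishCurrent();
        }
        return reachableFrom(state.dataOutputStream);
    }

    public static void main(String[] args) {
        System.out.println(retainedWithLocalHead(5, 3)); // prints 5
        System.out.println(retainedWithStateOnly(5, 3)); // prints 2
    }
}
```

The "prior" chain from the first issue has the same shape with the links pointing backwards: as long as the newest state is held, every earlier one stays reachable through it.
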
Fixing these two issues allows unparsing very large infosets that
require buffering, without running into out of memory errors.
DAFFODIL-2468