Justin,
I have been able to solve the problem with the GeoServerImpl instances
accumulating. I think this was a global problem, so that is good news.
The bad news, however, is that it was only a minor improvement and it
didn't solve the biggest memory leak, which is specific to app-schema.
However, working on this made me understand what is going on there.
The biggest issue is in all the schemas that app-schema builds: custom
schemas, but also GSML. Because these also import the static XSDs
like GML, XS, OGC, etc., backward links from these keep them alive.
App-schema keeps building new GSML and other schemas for each
datastore, without the old ones ever being disposed of. The substitution
groups in GML keep on growing and growing, containing multiple references
to the same elements, but from different instances of the same schema.
ReferencingDirectiveLeakPreventer and SubstitutionGroupLeakPreventer
don't seem to be doing the trick for me. I didn't figure out why,
because I concluded that I couldn't really use them. Schemas should be
removed from memory as soon as they are no longer needed, rather than
waiting until someone notices that there are duplicates (by then it
really is too late). I wrote a method that properly removes all
references to an existing schema from all other schemas. This seems to
do the trick for the WFS schemas and stops GeoServerImpl and all the
rest from being kept alive in memory.
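To illustrate what I mean by removing all references, here is a minimal
Java sketch on a toy model. It is not the method I actually wrote and it
does not use the Eclipse XSD API: the Schema and Element classes, the
referencedBy list and the dispose method are hypothetical names made up
for this example, and it assumes that back-references only live in the
schemas that the disposed schema imports.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: these classes stand in for, and are not, the Eclipse XSD API. */
public final class SchemaDisposalSketch {

    static final class Schema {
        final String name;
        final List<Schema> imports = new ArrayList<>();       // forward links
        final List<Schema> referencedBy = new ArrayList<>();  // backward links
        // head element name -> elements that may substitute for it
        final Map<String, List<Element>> substitutionGroups = new HashMap<>();

        Schema(String name) { this.name = name; }

        /** Importing a schema also registers a backward link on the target. */
        void importSchema(Schema target) {
            imports.add(target);
            target.referencedBy.add(this);
        }
    }

    static final class Element {
        final String name;
        final Schema owner;
        Element(String name, Schema owner) { this.name = name; this.owner = owner; }
    }

    static void addSubstitute(Schema target, String head, Element element) {
        target.substitutionGroups
              .computeIfAbsent(head, k -> new ArrayList<>())
              .add(element);
    }

    /** Removes every link that the imported schemas hold back to the doomed schema. */
    static void dispose(Schema doomed) {
        for (Schema target : doomed.imports) {
            target.referencedBy.removeIf(s -> s == doomed);
            for (List<Element> members : target.substitutionGroups.values()) {
                members.removeIf(e -> e.owner == doomed);
            }
        }
        doomed.imports.clear();
    }

    public static void main(String[] args) {
        Schema gml = new Schema("GML");
        Schema first = new Schema("GSML#1");
        Schema second = new Schema("GSML#2");
        first.importSchema(gml);
        second.importSchema(gml);
        addSubstitute(gml, "_Feature", new Element("MappedFeature", first));
        addSubstitute(gml, "_Feature", new Element("MappedFeature", second));

        dispose(first);
        System.out.println(gml.referencedBy.size());
        System.out.println(gml.substitutionGroups.get("_Feature").size());
    }
}
```

Once dispose has run, the static schema no longer holds anything that
points at the disposed instance, so the garbage collector is free to
reclaim it along with whatever it was keeping alive.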
The consequences for app-schema are big though. In the current setup,
you specify your schemas for each datastore (for each mapping). So even
in one single test, multiple versions of the GSML schema are alive. All
of them add up to multiple references to the same element in the
substitution groups of GML. It just does not make sense to
link multiple versions of the GSML schema to one single GML schema in
memory! Because of the backward links and substitution groups it
becomes one big clutter.
My solution is to keep a registry that maps schema locations to schemas
and reuse schemas that have already been built. That improves
performance and memory use at the same time.
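Roughly, the registry idea could look like the Java sketch below. This
is only an outline under assumptions: the class name SchemaRegistry, the
generic schema type S and the builder function are hypothetical and
stand in for whatever app-schema really uses to parse and build a
schema. The location string is used as the key exactly as given, with no
normalisation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch: builds each schema once per location and reuses it afterwards. */
public final class SchemaRegistry<S> {

    private final Map<String, S> schemas = new HashMap<>();
    private final Function<String, S> builder;  // assumed: parses and builds a schema

    public SchemaRegistry(Function<String, S> builder) {
        this.builder = builder;
    }

    /** Returns the schema already built for this location, building it on first use. */
    public synchronized S get(String location) {
        return schemas.computeIfAbsent(location, builder);
    }

    /** Forgets a schema so that it can be disposed of; returns it, or null if unknown. */
    public synchronized S remove(String location) {
        return schemas.remove(location);
    }

    public synchronized int size() {
        return schemas.size();
    }

    public static void main(String[] args) {
        int[] builds = {0};
        SchemaRegistry<Object> registry = new SchemaRegistry<>(location -> {
            builds[0]++;          // count how often a schema is really built
            return new Object();  // placeholder for a built schema
        });
        Object a = registry.get("http://example.org/schemas/gsml.xsd");
        Object b = registry.get("http://example.org/schemas/gsml.xsd");
        System.out.println(a == b);     // same instance is reused
        System.out.println(builds[0]);  // number of real builds
    }
}
```

Two datastores that name the same location then share one schema
instance instead of each linking their own copy into GML.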
In that case, the only way there could still be clutter in the XSD
schemas in memory is if for some reason someone is using the same schema
from different file locations, or different versions of the same schema,
in different datastores of the same instance of app-schema. I don't see
a way to avoid that.
Regards
Niels
On 22/03/11 22:33, Justin Deoliveira wrote:
Yup, I have put much time into figuring out and squashing these
issues. Comments inline.
On Tue, Mar 22, 2011 at 4:02 AM, Niels <[email protected]> wrote:
I have been trying to figure out a memory leak in app-schema (and
I think, perhaps even in geoserver in general).
The problem is that the app-schema unit tests run out of memory
when they are run in a batch by Maven, never when they are run by
themselves. While Maven is running the tests, data accumulates
on the heap until at some point it runs out and crashes. This is a
serious bug.
I have figured out quite a bit about it, using Java Memory Analyzer.
The data that is accumulating is mainly XSD schema information
(XSDElementDeclarationImpl etc...).
But no features or types are kept alive.
The other thing I figured out is that the GeoServerImpl, Catalog,
ResourcePool, etc. objects are not disposed of. For every test
that has been run, these objects are kept on the heap, although
the abstract test class does get rid of them. My first assumption
was that these objects were somehow also keeping XSD information
in memory, but I was wrong: it is the other way around.
I have included a screenshot in the attachment that shows the
path that keeps the GeoServerImpl alive, through the XSD classes.
I have also added a second screenshot of another GeoServerImpl
instance's path (in the same memory dump). If you look at the
addresses you can also see where these paths are the same and
where they split up.
XSD schemas can be imported into each other, and that is what is
going on here.
Here is a summary of my findings:
1. Almost all XSD classes are singletons, therefore static and
kept alive.
Done intentionally. They have to be cached since they are so expensive
to create.
2. org.geoserver.wfs.xml.v1_1_0.WFS is an exception to this rule: it
is instantiated and contains, indirectly, a link to the running
GeoServerImpl.
Yes, this is an issue, one I very much want to kill. I did some work to
fix this but app-schema soon became a blocker. It is kind of a
separate issue, but long story short: when we build up a schema object
we need to iterate over every type, since app-schema types have
dependencies among them. If we can figure out how to process
those dependencies rather than just build a schema from all types, we
get the side effect of getting rid of the WFSConfiguration
singleton... which is now a source of much pain.
3. For every instance of GeoServerImpl, a new
org.geoserver.wfs.xml.v1_1_0.WFS is created and *imported* into all
the other (static) XSD classes (OGC, GML, etc).
It should just be GML, but yeah, part of the work is to fix this as
well: reversing the import dependency so as not to modify the GML
schema.
4. These imports accumulate - For each test a new import is added.
Take a look at GML.buildSchema and you should see two adapters which
attempt to manage this: ReferencingDirectiveLeakPreventer, which
removes duplicate import statements so they do not accumulate, and
SubstitutionGroupLeakPreventer, which prevents the same problem
with the GML _Feature substitution group.
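As a rough illustration of the de-duplication idea only (this is not the
code of either adapter, and the names DuplicateImportSketch and
addReplacing are made up for the example), the Java sketch below drops
any existing entry with the same key before a new one is added, so
repeated test runs leave a single import per namespace. It assumes each
directive is a two-element array of namespace and an instance label.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

/** Hypothetical sketch of "replace instead of accumulate" for import directives. */
public final class DuplicateImportSketch {

    /** Adds item, first removing any existing entry that has the same key. */
    static <T, K> void addReplacing(List<T> items, T item, Function<T, K> keyOf) {
        K key = keyOf.apply(item);
        items.removeIf(existing -> Objects.equals(keyOf.apply(existing), key));
        items.add(item);
    }

    public static void main(String[] args) {
        // Each directive is {namespace, label}; the label marks which test added it.
        List<String[]> imports = new ArrayList<>();
        for (int test = 1; test <= 3; test++) {
            String[] directive = {"http://www.opengis.net/wfs", "instance-" + test};
            addReplacing(imports, directive, d -> d[0]);
        }
        System.out.println(imports.size());     // one entry per namespace
        System.out.println(imports.get(0)[1]);  // the most recently added one
    }
}
```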
But what I cannot find is where and how this
org.geoserver.wfs.xml.v1_1_0.WFS is imported into the other XSD
classes. Is there anyone who can point me in the right direction?
Also, even if I do find the place where this happens - the
question remains what to do.
1. Just try to "undo" the import when closing down geoserver? In
that case I definitely need to figure out how the import is
happening in the first place.
2. Just clear out all of the static XSD classes, and rebuild them
each time.
Definitely (1), or be prepared to wait hours for your tests to finish
:) Again, if we can figure out the app-schema issue this will get a lot
better. The issue is that when we build an XSDSchema object
for a complex feature type, we need some way to traverse the feature
type dependency graph and pull only those dependencies into the
XSDSchema object, rather than just build a schema with them all. Also
note that this is an issue that severely limits GeoServer's ability to
handle many layers.
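The traversal mentioned above could be sketched in Java as a plain
reachability walk over a dependency map. This is an assumption-laden
outline, not existing GeoServer code: the class DependencyClosureSketch,
the closure method and the map of type names to the type names they
depend on are all hypothetical, and real feature types would of course
not be bare strings.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: collect only the types reachable from one root type. */
public final class DependencyClosureSketch {

    /** Returns root plus every type it depends on, directly or indirectly. */
    static Set<String> closure(String root, Map<String, List<String>> dependsOn) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            String type = pending.pop();
            if (seen.add(type)) {  // false if already visited, which also breaks cycles
                for (String dep : dependsOn.getOrDefault(type, Collections.emptyList())) {
                    pending.push(dep);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Map<String, List<String>> dependsOn = new HashMap<>();
        dependsOn.put("A", Arrays.asList("B", "C"));
        dependsOn.put("B", Arrays.asList("C"));
        dependsOn.put("C", Arrays.asList("A"));  // cycle back to A
        dependsOn.put("D", Arrays.asList("A"));  // unrelated to a request for A

        // Only A, B and C would need to go into the schema built for A.
        System.out.println(closure("A", dependsOn));
    }
}
```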
Regards
--
*Niels Charlier*
Software Engineer
CSIRO Earth Science and Resource Engineering
Phone: +61 8 6436 8914
Australian Resources Research Centre
26 Dick Perry Avenue, Kensington WA 6151
_______________________________________________
Geoserver-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/geoserver-devel
--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.
_______________________________________________
Geotools-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/geotools-devel