Hi Niels,
On Thu, Mar 24, 2011 at 12:34 AM, Niels <[email protected]> wrote:
> Justin,
>
> I have been able to solve the problem with the GeoServerImpl instances
> accumulating. I think this was a global problem, so that is good news. The
> bad news, however, is that it was only a minor improvement and it didn't
> solve the biggest memory leak, which is specific to app-schema. However,
> working on this made me understand what is going on there.
>
> The biggest issue is in all the schemas that app-schema builds: custom
> schemas, but also GSML. Because these import the static XSDs like GML, XS,
> OGC, etc., backward links from those static schemas keep them alive.
> App-schema keeps building new GSML and other schemas for each datastore,
> without the old ones ever being disposed of. The substitution groups in GML
> keep on growing and growing, containing many references to the same
> elements, but from different instances of the same schema.
>
> ReferencingDirectiveLeakPreventer and SubstitutionGroupLeakPreventer don't
> seem to be doing the trick for me. I didn't figure out why, because I
> concluded that I couldn't really use them. Schemas should be removed from
> memory as soon as they are no longer needed, not only once someone figures
> out that there are duplicates (by then it really is too late). I wrote a
> method that properly removes all references to an existing schema from all
> other schemas. This seems to do the trick for the WFS schemas and stops
> GeoServerImpl and all the rest being kept alive in memory.
>
Not sure I follow. Those preventer adapters should not allow the same
element or schema to be added more than once... so I am curious as to how
the back references are accumulating. So maybe the objects won't be
dereferenced right away, but eventually they should be. Although if you are
building up multiple different schemas I can see how this could be an
issue.
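
To make that last point concrete, here is a minimal sketch in plain Java.
All names are made up for illustration and it does not reproduce the actual
preventer code or the Eclipse XSD API. It only assumes, hypothetically, a
duplicate check based on object identity, and contrasts it with one based on
the qualified name when the same schema is rebuilt several times:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a GML substitution group; not the Eclipse XSD API. */
public class SubstitutionGroupSketch {

    /** Hypothetical element declaration: a qualified name plus the schema instance that owns it. */
    public static final class Element {
        public final String qName;
        public final Object owner;

        public Element(String qName, Object owner) {
            this.qName = qName;
            this.owner = owner;
        }
    }

    private final List<Element> members = new ArrayList<>();

    /** Identity check: the same element from a rebuilt schema is a new object, so it is added again. */
    public void addIfNotSameInstance(Element e) {
        for (Element m : members) {
            if (m == e) {
                return;
            }
        }
        members.add(e);
    }

    /** Name check: an element with the same qualified name replaces the stale one. */
    public void addOrReplaceByName(Element e) {
        members.removeIf(m -> m.qName.equals(e.qName));
        members.add(e);
    }

    public int size() {
        return members.size();
    }
}
```

With the identity check, every rebuilt schema leaves one more member behind,
and each member keeps its owning schema reachable. With the name check the
group stays at one member per qualified name.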
>
> The consequences for app-schema are big though. In the current setup, you
> specify your schemas for each datastore (for each mapping). So even in one
> single test, multiple versions of the GSML schema are alive. All of them
> add up to multiple references to the same element in the substitution
> groups of GML. It just does not make any sense to link multiple versions of
> the GSML schema to one single GML schema in memory! Because of the backward
> links and substitution groups it becomes one big clutter.
>
Right... but is that not the problem that the preventers "prevent" :). I am
curious as to why they are not working in this case.
>
> My solution is to keep a registry that maps schema locations to schemas
> and to reuse schemas that have already been built. That improves
> performance and memory use at the same time.
>
> In that case, the only way there could still be clutter in the XSD
> schemas in memory is if for some reason someone is using the same schema
> from different file locations, or different versions of the same schema, in
> different datastores of the same instance of app-schema. I don't see a way
> to avoid that.
>
Another possible solution would be to create some sort of wrapper, or a
registry with which client code registers every schema it builds. When the
schema object is no longer required, some dispose method is called. At that
point the schema removes itself from any schemas that it imported or
referenced via substitution group.
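
A rough sketch of how the two ideas could fit together (reuse by schema
location, plus an explicit dispose step), in plain Java. Everything here is
hypothetical: SchemaRegistry, Schema, acquire and release are made-up names,
and the Schema class is a stand-in, not the Eclipse XSD API. It assumes that
back links from an imported schema are what keep the importing schema alive:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical registry that reuses schemas by location and unlinks them on last release. */
public class SchemaRegistry {

    /** Hypothetical stand-in for a parsed schema; not the Eclipse XSD API. */
    public static final class Schema {
        public final String location;
        /** Schemas this one imports (for example GML). */
        public final List<Schema> imports = new ArrayList<>();
        /** Back links: schemas that import this one. These keep importers reachable. */
        public final List<Schema> referencedBy = new ArrayList<>();
        int users;

        public Schema(String location) {
            this.location = location;
        }

        public void addImport(Schema target) {
            imports.add(target);
            target.referencedBy.add(this);
        }
    }

    private final Map<String, Schema> byLocation = new HashMap<>();

    /** Returns the schema already built for this location, or builds and registers it. */
    public synchronized Schema acquire(String location, Function<String, Schema> builder) {
        Schema s = byLocation.get(location);
        if (s == null) {
            s = builder.apply(location);
            byLocation.put(location, s);
        }
        s.users++;
        return s;
    }

    /** Releases one use; the last release removes the back links held by imported schemas. */
    public synchronized void release(Schema s) {
        if (--s.users > 0) {
            return;
        }
        byLocation.remove(s.location, s);
        for (Schema target : s.imports) {
            target.referencedBy.remove(s);
        }
        s.imports.clear();
    }
}
```

Two datastores asking for the same location get the same object, and the
static schema only holds a back link until the last user calls release. The
case of the same schema under different file locations is not handled here.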
>
> Regards
> Niels
>
>
> On 22/03/11 22:33, Justin Deoliveira wrote:
>
> Yup, I have put much time into figuring out and squashing these issues.
> Comments inline.
>
> On Tue, Mar 22, 2011 at 4:02 AM, Niels
> <[email protected]> wrote:
>
>> I have been trying to figure out a memory leak in app-schema (and I
>> think, perhaps even in geoserver in general).
>>
>> The problem is that the app-schema unit tests run out of memory when they
>> are run in a batch by maven, never when they are run by themselves. While
>> maven is running the tests, data is accumulated on the heap and at some
>> point it will run out and crash. This is a serious bug.
>>
>> I have figured out quite a bit about it, using Java Memory Analyzer.
>> The data that is accumulating is mainly XSD schema information
>> (XSDElementDeclarationImpl etc...).
>> But no features or types are kept alive.
>>
>> The other thing I figured out is that the GeoServerImpl , Catalog,
>> ResourcePool, etc.. objects are not disposed of. For every test that has
>> been run, these objects are kept on the heap! Although, the abstract test
>> class does get rid of them. My first assumption was that these objects were
>> somehow also keeping XSD information in memory, but I was wrong: it is the
>> other way around.
>>
>> I have included a screenshot in the attachment that shows the path that
>> keeps the GeoServerImpl alive, through the XSD classes. I have also added a
>> second screenshot of another GeoServerImpl instance's path (in the same
>> memory dump). If you look at the addresses you can also see where these
>> paths are the same and where they split up.
>> XSD schemas can be imported into each other, and that is what is going
>> on here.
>>
>> Here is a summary of my findings
>> 1. Almost all XSD classes are singletons, therefore static and kept alive.
>>
> Done intentionally. They have to be cached since they are so expensive to
> create.
>
>> 2. org.geoserver.wfs.xml.v1_1_0.WFS is an exception to this rule: it is
>> instantiated and contains, indirectly, a link to the running GeoServerImpl.
>>
> Yes, this is an issue, one I very much want to kill. I did some work to fix
> this but app-schema soon became a blocker. It is kind of a separate issue,
> but long story short: when we build up a schema object we need to iterate
> over every type, since app-schema types have dependencies among them. If we
> can figure out how to process those dependencies rather than just build a
> schema from all types, we get the side effect of getting rid of the
> WFSConfiguration singleton... which is now a source of much pain.
>
>> 3. For every instance of GeoServerImpl, a new
>> org.geoserver.wfs.xml.v1_1_0.WFS is created and *imported* in all the other
>> (static) XSD classes (OGC, GML, etc).
>>
> It should just be GML, but yeah, part of the work is to fix this as well..
> reversing the importing dependency so as not to modify the gml schema.
>
>
>> 4. These imports accumulate - For each test a new import is added.
>>
>
> Take a look at GML.buildSchema and you should see two adapters which
> attempt to manage this. ReferencingDirectiveLeakPreventer which removes
> duplicate import statements so they do not accumulate. And
> SubstitutionGroupLeakPreventer which prevents the same problem but with the
> gml _Feature substitution group.
>
>>
>> But what I cannot find is where and how this
>> org.geoserver.wfs.xml.v1_1_0.WFS is imported into the other XSD classes. Is
>> there anyone who can point me in the right direction?
>>
>> Also, even if I do find the place where this happens - the question
>> remains what to do.
>> 1. Just try to "undo" the import when closing down geoserver? In that case
>> I definitely need to figure out how the import is happening in the first
>> place.
>> 2. Just clear out all of the static XSD classes, and rebuild them each
>> time.
>>
>>
> Definitely (1) or be prepared to wait hours for your tests to finish :)
> Again if we can figure out the app-schema issue this will get a lot better.
> The issue, as above, is that when we build an XSDSchema object for a
> complex feature type, we need some way to traverse the feature type
> dependency graph and pull any dependencies into the XSDSchema object,
> rather than just build a schema with them all. Also note that this issue
> severely limits how many layers GeoServer can contain.
>
> Regards
>>
>> --
>> *Niels Charlier*
>>
>> Software Engineer
>> CSIRO Earth Science and Resource Engineering
>> Phone: +61 8 6436 8914
>>
>> Australian Resources Research Centre
>> 26 Dick Perry Avenue, Kensington WA 6151
>>
>>
>> ------------------------------------------------------------------------------
>> Enable your software for Intel(R) Active Management Technology to meet the
>> growing manageability and security demands of your customers. Businesses
>> are taking advantage of Intel(R) vPro (TM) technology - will your software
>> be a part of the solution? Download the Intel(R) Manageability Checker
>> today! http://p.sf.net/sfu/intel-dev2devmar
>> _______________________________________________
>> Geoserver-devel mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/geoserver-devel
>>
>>
>
>
> --
> Justin Deoliveira
> OpenGeo - http://opengeo.org
> Enterprise support for open source geospatial.
>
--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.
_______________________________________________
Geotools-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/geotools-devel