G'day all,

I have been given the task of generating WSDL from my company's large
collection of application models, as well as handling the invocation of
the corresponding services, which are already deployed. There are hundreds
of possible services, with a handful of large (2MB) shared schemas.

When trying to run a small Jetty server with more than one of these
generated WSDLs I quickly ran out of memory (the default setting - 64M I
think). While it wouldn't be hard to bump up the memory allocation, I feared
the final scenario of hundreds of WSDLs would be problematic even for large
amounts of memory.

To cut a long story short this is what I found:

1. For each WSDL, every imported schema is loaded into memory, regardless of
whether it is shared among other WSDLs.
2. Every Schema DOM tree is stored in memory after parsing.

Given that the Schema is parsed to the more useful XmlSchema object tree,
I'm not sure what benefits are gained from keeping it in DOM. I fixed the
memory bloat by some minor changes in SchemaUtil, which I will explain
briefly here. Note that reflection was unfortunately required in dealing
with the XmlSchema library.

1. In extractSchema, used a static map to add any already-cached schemas to
the XmlSchemaCollection parameter before calling
schemaCol.read(schemaElem, systemId).

2. Nulled out cached DOM elements in the following:

   - extractSchema() -> xmlSchema.setElement() (well actually I stopped
   it being set)
   - addSchema() -> schema.setElement() after targetNamespace is
   retrieved
   - At the end of getSchemas(), iterate over any new schemas; for each
   one, get its NodeNamespaceContext and call getDeclaredPrefixes() before
   setting its node field to null.

3. Ignored schemaList from the constructor and instead just relied on an
internal set to avoid recursion. (I think this map is only needed for
WSDL2Java?)
4. Fixed WSDLQueryHandler to output the full WSDL, which was otherwise
incomplete because of the missing schema node (I loaded it from the file
system instead of serialising the Definition object).
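
For anyone curious, here is a rough, self-contained sketch of the idea
behind steps 1 and 2, using only the JDK's DOM parser. It is not the actual
SchemaUtil change and does not touch the XmlSchema API; ParsedSchema,
SharedSchemaCache and getOrParse are hypothetical names made up for
illustration, and the prefix handling only looks at the root element. The
point is simply that a schema is parsed once per systemId, whatever is needed
from the DOM is copied out, and the DOM reference is then dropped.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

// Hypothetical stand-in for a parsed schema; NOT the XmlSchema class.
final class ParsedSchema {
    private Element dom; // retained only until releaseDom() is called
    private final String targetNamespace;
    private final Map<String, String> prefixes = new LinkedHashMap<>();

    ParsedSchema(Element schemaElem) {
        this.dom = schemaElem;
        this.targetNamespace = schemaElem.getAttribute("targetNamespace");
        // Copy the namespace declarations out of the DOM before it is dropped.
        // Simplification: only prefixed declarations on the root element.
        NamedNodeMap attrs = schemaElem.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Attr a = (Attr) attrs.item(i);
            if (a.getName().startsWith("xmlns:")) {
                prefixes.put(a.getName().substring("xmlns:".length()), a.getValue());
            }
        }
    }

    void releaseDom() { dom = null; } // lets the DOM tree be garbage collected
    boolean hasDom() { return dom != null; }
    String targetNamespace() { return targetNamespace; }
    Map<String, String> prefixes() { return prefixes; }
}

// Hypothetical static cache shared by every WSDL loaded in the same JVM.
final class SharedSchemaCache {
    private static final Map<String, ParsedSchema> CACHE = new ConcurrentHashMap<>();

    static ParsedSchema getOrParse(String systemId, Supplier<String> xmlSource) {
        ParsedSchema cached = CACHE.get(systemId);
        if (cached != null) {
            return cached; // shared schema: no second read or parse
        }
        ParsedSchema parsed = parse(xmlSource.get());
        parsed.releaseDom();
        // get + putIfAbsent rather than computeIfAbsent, so that a parser
        // which recursively resolves imports cannot trigger a recursive
        // update of the map. Two threads may occasionally parse twice.
        ParsedSchema raced = CACHE.putIfAbsent(systemId, parsed);
        return raced != null ? raced : parsed;
    }

    private static ParsedSchema parse(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            Element root = f.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return new ParsedSchema(root);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse schema", e);
        }
    }
}
```

Keying on systemId is an assumption on my part here; keying on
targetNamespace might suit some setups better, and a static map like this
never evicts anything, so it only makes sense when the set of shared schemas
is bounded.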

I guess my biggest qualm in all this is that it was extremely difficult to
subclass SchemaUtil and wire it in through Spring to make the required
changes. In particular, I had to reproduce the following call chain to fix
the problem.

JaxWsServiceFactoryBean -> buildServiceFromWSDL() -> WSDLServiceFactory ->
create() -> WSDLServiceBuilder -> getSchemas() -> SchemaUtil

Because neither SchemaUtil nor any of the other classes is a Spring-managed
object, and because most of the methods/fields are private, I ended up
literally copy+pasting each class.

Forgive me if this all sounds like criticism, because I am very impressed
and happy with CXF. This is as much a record of my findings as anything
else.

Anyway. I'm not too worried about what happens now but I am curious what you
guys think of all this.

Cheers,

Charles O'Farrell
