Two things.
DOM nearly always generates excessive object creation, which under
load stresses the GC of any JVM. I did some comparisons between DOM
and SAX for smallish XML blocks (< 100 elements) a while back, and SAX
came out ahead by perhaps 3x on speed and 4x on memory; I can't
remember the precise details. I understand that if node manipulation
is what's required, DOM is just easier and may be the same weight, but
if it's XML -> object then it may be better to consider SAX or a push
parser. Cocoon had some bad experience with DOM in the request cycle
in v1 (a long time ago, when parsers were even more creaky).
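
For the XML -> object case, a minimal SAX sketch in Java might look like the following. The Item class, the "item" element and its "id" and "title" attributes are all made up for illustration; the point is only that objects are built straight from parser events, with no intermediate DOM tree.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

final class SaxToObject {

    // Hypothetical target type, not taken from any real code base.
    static final class Item {
        final String id;
        final String title;

        Item(String id, String title) {
            this.id = id;
            this.title = title;
        }
    }

    // Builds Item objects directly from SAX events; no DOM tree is created.
    static List<Item> parse(String xml) throws Exception {
        final List<Item> items = new ArrayList<>();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
            new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes attrs) {
                    // The factory is not namespace-aware by default,
                    // so qName carries the plain element name.
                    if ("item".equals(qName)) {
                        items.add(new Item(attrs.getValue("id"),
                                           attrs.getValue("title")));
                    }
                }
            });
        return items;
    }

    public static void main(String[] args) throws Exception {
        List<Item> items = parse(
            "<items><item id=\"1\" title=\"a\"/><item id=\"2\" title=\"b\"/></items>");
        System.out.println("parsed " + items.size() + " items");
    }
}
```
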
The other thing I have been told by IBMers is that their JVM makes it
hard to replace the Xerces impl. I don't know if that's relevant.
The last 2 comments here
http://jira.sakaiproject.org:8081/jira/browse/SAK-14388
give some insight into a similar problem.
HTH
Ian
On 19 Sep 2008, at 13:59, Kevin Brown wrote:
I've noticed that the current performance of xml parsing is pretty bad.
I've got a patch ready to go to improve this substantially. On our
internal deployment it has cut down memory and CPU usage substantially
and has significantly improved overall response time.
Most of the changes that need to be made are compatible with fairly old
xml parsers, but not so many work with pooling DocumentBuilders. For
instance, xerces needs to be upgraded (I'm using 2.8.1, which is about
2 years old, and that works).
Does anyone have any strong objections to upgrading xerces, or are
there any instances of xml parsers being used that also don't support
DocumentBuilder.reset?
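
As a rough sketch of what reusing DocumentBuilders with reset() can look like (this is not the patch itself; the ReusableDomParser class and the one-builder-per-thread choice are assumptions made for illustration):

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;

// Hypothetical helper: keeps one DocumentBuilder per thread and reuses it.
final class ReusableDomParser {

    // DocumentBuilder is not thread-safe, so each thread gets its own.
    private static final ThreadLocal<DocumentBuilder> BUILDER =
        new ThreadLocal<DocumentBuilder>() {
            @Override
            protected DocumentBuilder initialValue() {
                try {
                    return DocumentBuilderFactory.newInstance().newDocumentBuilder();
                } catch (ParserConfigurationException e) {
                    throw new IllegalStateException(e);
                }
            }
        };

    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = BUILDER.get();
        try {
            return builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } finally {
            // Restores the builder to its original configuration.
            // Parsers that do not implement reset() throw
            // UnsupportedOperationException here.
            builder.reset();
        }
    }

    public static void main(String[] args) throws Exception {
        Document first = parse("<a><b/></a>");
        Document second = parse("<c/>");
        System.out.println(first.getDocumentElement().getNodeName());
        System.out.println(second.getDocumentElement().getNodeName());
    }
}
```

With a parser that lacks reset(), the finally block is where the UnsupportedOperationException would surface, which is why the parser version matters for this approach.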