Two things.

DOM nearly always creates an excessive number of objects, which under load stresses the GC of any JVM. A while back I compared DOM and SAX on smallish XML blocks (< 100 elements), and the difference was perhaps 3x on speed and 4x on memory; I can't remember the precise details. I understand that if node manipulation is what's required, DOM is just easier and may be the same weight, but if it's xml -> object then it may be better to consider SAX or a push parser. Cocoon had some bad experience with DOM in the request cycle in v1 (a long time ago, when parsers were even more creaky).
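
To make the xml -> object case concrete, here is a minimal sketch of mapping elements straight to objects with SAX, so no DOM tree is ever built. The <item name="..."> layout and the SaxItemReader and Item names are made up for illustration and are not taken from any real schema or from the patch under discussion.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical example: turns <item name="..">text</item> elements into objects. */
final class SaxItemReader {

    /** Hypothetical value object, assumed only for this sketch. */
    static final class Item {
        final String name;
        final String text;

        Item(String name, String text) {
            this.name = name;
            this.text = text;
        }
    }

    static List<Item> read(String xml) {
        final List<Item> items = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            private String name;        // attribute of the <item> being read
            private StringBuilder text; // non-null only while inside an <item>

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("item".equals(qName)) {
                    name = attrs.getValue("name");
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times per element, so append rather than assign.
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("item".equals(qName)) {
                    items.add(new Item(name, text.toString()));
                    text = null;
                }
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
        return items;
    }
}
```

The only objects that survive the parse are the Item instances themselves, which is where the saving over DOM would come from.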

The other thing I have been told by IBMers is that their JVM makes it hard to replace the Xerces impl. I don't know if that's relevant.

The last 2 comments here http://jira.sakaiproject.org:8081/jira/browse/SAK-14388
give some insight into a similar problem.


HTH
Ian
On 19 Sep 2008, at 13:59, Kevin Brown wrote:

I've noticed that the current performance of xml parsing is pretty bad.

I've got a patch ready to go to improve this substantially. On our internal
deployment it has cut down memory and CPU usage substantially and has
significantly improved overall response time.

Most of the changes that need to be made are compatible with fairly old xml
parsers, but fewer parsers work with pooling DocumentBuilders. For instance,
xerces needs to be upgraded (I'm using 2.8.1, which is about 2 years old,
and that works).

Does anyone have any strong objections to upgrading xerces, or are there any
instances of xml parsers being used that also don't support
DocumentBuilder.reset?
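
For reference, a minimal sketch of the kind of reuse that depends on DocumentBuilder.reset. This is not the patch itself: it assumes a simple one-builder-per-thread scheme rather than a real pool, and the ReusableDomParser name is made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical helper: reuses one DocumentBuilder per thread instead of creating one per parse. */
final class ReusableDomParser {

    // DocumentBuilder is not thread-safe, so each thread keeps its own instance.
    private static final ThreadLocal<DocumentBuilder> BUILDER = ThreadLocal.withInitial(() -> {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Could not create DocumentBuilder", e);
        }
    });

    static Document parse(String xml) {
        DocumentBuilder builder = BUILDER.get();
        try {
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        } finally {
            // Returns the builder to its initial state so it can be reused.
            // Parsers that do not implement reset() throw
            // UnsupportedOperationException here, which is the compatibility
            // concern raised above.
            builder.reset();
        }
    }
}
```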
