Subtitle 1: How to make SAX fly.
Subtitle 2: Should I use DOM instead?

My application retrieves several items (attributes and text) from large XML files and uses them to build a spreadsheet. The app is based on JAXP, and the code contains many lines like these:

cell.cellValue = oneItemAtATime(xmlFile, "//root/creator/@user");
cell.cellValue = oneItemAtATime(xmlFile, "//root/creator/@project");
cell.cellValue = oneItemAtATime(xmlFile, "//root/creator/@projectpath");
cell.cellValue = oneItemAtATime(xmlFile, "//root/creator/@title");
cell.cellValue = oneItemAtATime(xmlFile, "//root/creator/@notes");
cell.cellValue = oneItemAtATime(xmlFile, "//root/creator/@computer");

[...]

(Note: the 'oneItemAtATime()' function runs a single XPath query against the given XML file.)

Performance is rather poor because SAX (used internally by the 'XPathDemo' app that comes with the JAXP distro) scans the whole XML file once for every item above. To make things worse, several XML files may be opened one after another, and all of the items above are retrieved from each of them.

I have considered two ways to improve performance and would appreciate some advice.

(1) Rewrite the XPath query so that it is based on DOM (how? details are most welcome!). That way the file would be read only once from the filesystem. IOW, remove SAX from the application.

(2) Stick with SAX, but use it in a smarter way: capture all the required items above in a single pass while it reads the XML tree. Again, the question is how.
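
To make option (1) concrete, here is a rough sketch of what I have in mind, using only the standard JAXP classes: parse the file once into a DOM Document, then evaluate every XPath expression against that in-memory tree. The class name 'ParseOnceXPath' and the method 'allItemsAtOnce()' are made-up names for illustration, not part of my app or of JAXP, and I have not measured whether the memory cost is acceptable for large files.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Hypothetical helper: one parse per file, many XPath lookups per parse.
class ParseOnceXPath {

    // Parses the stream once, then evaluates each expression against the DOM.
    // Returns the string value of each expression, keyed by the expression itself.
    static Map<String, String> allItemsAtOnce(InputStream xml, String... expressions)
            throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(xml);          // the only pass over the file

        XPath xpath = XPathFactory.newInstance().newXPath();
        Map<String, String> values = new LinkedHashMap<>();
        for (String expr : expressions) {
            // String evaluation yields "" when the expression matches nothing.
            values.put(expr, xpath.evaluate(expr, doc));
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample document; the real files are much larger.
        String sample = "<root><creator user=\"ramon\" project=\"demo\"/></root>";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        Map<String, String> items = allItemsAtOnce(in,
                "//root/creator/@user",
                "//root/creator/@project");
        items.forEach((expr, value) -> System.out.println(expr + " = " + value));
    }
}
```

Each 'cell.cellValue = ...' line would then read from the returned map instead of triggering a new parse.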

As they say, the devil is in the details.
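
For option (2), my best guess is a custom SAX handler that collects every attribute of the 'creator' element during one pass. The sketch below assumes that 'creator' is a direct child of the document element 'root' (which is narrower than '//root/creator'), that namespaces are not in play, and that only attributes are needed; text items would need a 'characters()' override as well. The names 'SinglePassCreatorReader' and 'creatorAttributes()' are invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: one SAX pass that grabs all attributes of /root/creator.
class SinglePassCreatorReader {

    static Map<String, String> creatorAttributes(InputStream xml) throws Exception {
        final Map<String, String> found = new LinkedHashMap<>();
        final Deque<String> path = new ArrayDeque<>();   // names of the open elements

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes atts) {
                path.addLast(qName);
                // Assumption: the element of interest sits exactly at /root/creator.
                if (String.join("/", path).equals("root/creator")) {
                    for (int i = 0; i < atts.getLength(); i++) {
                        // Keep the first value seen, as a string XPath query would.
                        found.putIfAbsent(atts.getQName(i), atts.getValue(i));
                    }
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                path.removeLast();
            }
        };

        SAXParserFactory.newInstance().newSAXParser().parse(xml, handler);
        return found;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample document; the real files are much larger.
        String sample = "<root><creator user=\"ramon\" title=\"t1\"/><other/></root>";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        creatorAttributes(in).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

The spreadsheet cells could then be filled from the returned map (for example, the value stored under "user") after a single scan of each file.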

TIA for sharing your expertise...

Regards,

-Ramon



