[Axis2] Notes on underlying MXparser of OM

jayachandra Thu, 21 Apr 2005 05:58:34 -0700

Hi all!
Continuing my work on XMLConformance: I've naively implemented
OMComment, OMPI and OMDTD (makeshift vanilla implementations without
any sort of validation) and run the XMLConformance tests. The test
suite provided by W3C contains a whole lot of invalid and ill-formed
XMLs along with valid ones. In this phase I wanted to concentrate on
how well we deal with the valid ones, leaving aside the rejection of
invalid and ill-formed ones. So I pruned the suite down to the valid
XMLs and tested OM conformance against them.
With the makeshift implementations for DTD, PI and comments in place,
I expected 100% success in parsing the XML files, but it didn't quite
happen. Only 761 of the 960 input XMLs got parsed. In this
connection, I've observed a few limitations of the StAX parser we are
using (MXParser) that are worth mentioning and require serious
attention.
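
For anyone who wants to reproduce this kind of tally, here is a minimal
sketch of the idea: pull every event out of each document and count the
documents that get through without an error. It is not the actual test
code. It assumes the StAX implementation bundled with the JDK stands in
for MXParser, and the class and method names (ParseTally, parses,
countParsed) are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Axis2 or MXParser.
public class ParseTally {

    // Returns true if every event of the document can be pulled
    // without the parser reporting an error.
    static boolean parses(XMLInputFactory factory, byte[] doc) {
        try (InputStream in = new ByteArrayInputStream(doc)) {
            XMLStreamReader reader = factory.createXMLStreamReader(in);
            while (reader.hasNext()) {
                reader.next();
            }
            reader.close();
            return true;
        } catch (XMLStreamException | IOException e) {
            return false;
        }
    }

    // Counts how many of the given documents parse to the end.
    static int countParsed(List<byte[]> docs) {
        // Uses whichever StAX implementation is on the classpath.
        XMLInputFactory factory = XMLInputFactory.newInstance();
        int ok = 0;
        for (byte[] doc : docs) {
            if (parses(factory, doc)) {
                ok++;
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        List<byte[]> docs = List.of(
            "<a><!-- c --><?pi x?><b/></a>".getBytes(StandardCharsets.UTF_8),
            "<a><b></a>".getBytes(StandardCharsets.UTF_8)); // ill-formed
        System.out.println(countParsed(docs) + " of " + docs.size() + " parsed");
    }
}
```

In a real run the byte arrays would come from the pruned valid files of
the W3C suite rather than from inline strings.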


1. The MXParser doesn't seem to support the UTF-8 character set fully.
Japanese XML files weren't parsed properly. In future, this could
pose a serious problem: it can affect the SOAP message processing of
foreign web services.

2. The DTDParser inside MXParser failed to understand the DTD
declaration line(s) of several (complex?) DTDs. This might not seem
like a problem from the SOAP message processing point of view, but
with such behaviour our OM certainly cannot offer complete XML
infoset support.
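
A quick way to check point 1 against any StAX parser is to feed it a
small UTF-8 document with Japanese text and compare what comes back.
The sketch below is only an illustration: it assumes the JDK's bundled
StAX implementation is the parser under test, and the class and method
names (Utf8RoundTrip, textOf) are invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Axis2 or MXParser.
public class Utf8RoundTrip {

    // Concatenates all character data reported by the parser.
    static String textOf(byte[] xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
            .createXMLStreamReader(new ByteArrayInputStream(xml));
        StringBuilder text = new StringBuilder();
        while (reader.hasNext()) {
            // Text may arrive in several CHARACTERS events, so append each.
            if (reader.next() == XMLStreamConstants.CHARACTERS) {
                text.append(reader.getText());
            }
        }
        reader.close();
        return text.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        String expected = "\u65e5\u672c\u8a9e"; // the word "Japanese" in Japanese
        byte[] xml = ("<?xml version=\"1.0\" encoding=\"UTF-8\"?><doc>"
            + expected + "</doc>").getBytes(StandardCharsets.UTF_8);
        System.out.println(expected.equals(textOf(xml)) ? "round trip ok" : "text was altered");
    }
}
```

If the text comes back altered, or the parser throws, the problem lies
in the parser's decoding rather than in OM.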

Thanks
Jaya
-- 
-- Jaya
