Hello everyone,

I'm a student applying for the SoC <http://code.google.com/soc/> project
"Add support for the StAX (JSR-173) cursor API to Xerces-J"
<http://wiki.apache.org/general/SummerOfCode2007>.
Michael suggested that I discuss my proposal on the mailing list, so I
would like to introduce my thoughts and plan for this student project. Any
comments are welcome. :)

The abstract description of the project is: "To design and implement the
cursor-based XMLStreamReader
<http://java.sun.com/javase/6/docs/api/javax/xml/stream/XMLStreamReader.html>
(and filtering
<http://java.sun.com/javase/6/docs/api/javax/xml/stream/StreamFilter.html>
support). It should be possible to accomplish this using XNI
<http://xerces.apache.org/xerces2-j/xni.html> by building the
XMLStreamReader on top of an XMLPullParserConfiguration
<http://xerces.apache.org/xerces2-j/javadocs/xni/org/apache/xerces/xni/parser/XMLPullParserConfiguration.html>."
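
To make the target concrete, here is a minimal sketch of how client code
uses the cursor API together with a StreamFilter. It relies only on the
javax.xml.stream classes shipped with the JDK, not on Xerces. The class and
method names (CursorApiDemo, startTagNames) are made up for illustration,
and the filter is applied by hand instead of going through
XMLInputFactory.createFilteredReader, to keep the sketch simple.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.StreamFilter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class, not part of Xerces or of the proposal itself.
class CursorApiDemo {

    /** Returns the local names of all start tags, in document order. */
    static List<String> startTagNames(String xml) {
        // A StreamFilter inspects the reader's current event and accepts or rejects it.
        StreamFilter onlyStartTags = new StreamFilter() {
            public boolean accept(XMLStreamReader reader) {
                return reader.isStartElement();
            }
        };
        List<String> names = new ArrayList<String>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            // Pull loop: the application asks for the next event when it is ready.
            while (reader.hasNext()) {
                reader.next();
                if (onlyStartTags.accept(reader)) {
                    names.add(reader.getLocalName());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(startTagNames("<a><b/>text</a>"));
    }
}
```

The point of the sketch is the control flow: the caller drives the parser
through hasNext() and next(), which is the behaviour the XNI-based
implementation has to reproduce.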


Besides XNI, there are several other ways to implement the StAX interfaces.
For example, one could parse the XML document as raw text and start from
scratch (reading characters, building tokens, interpreting the tokens, and
so on), or implement a converter on top of the existing DOM or SAX
interfaces. However, after reading the Xerces source code, I found that both
the SAX and DOM implementations are based on XNI, so it seems most natural
to build StAX on XNI as well.


To implement XMLStreamReader, two important preconditions must be met:

1.       XML event information can be received.

2.       The pull style parsing process can be simulated.

When I looked through the XNI interfaces, I found that XNI actually meets
both preconditions. The handler interfaces in XNI, such as
XMLDocumentHandler and XMLDTDHandler, receive XML events such as
startDocument and endDocument, which map naturally onto the corresponding
StAX events. The XMLPullParserConfiguration
<http://xerces.apache.org/xerces2-j/javadocs/xni/org/apache/xerces/xni/parser/XMLPullParserConfiguration.html>
interface in XNI represents a parser configuration that can be used as the
configuration for a "pull" parser, so the pull parsing process of StAX can
be simulated by calling its "boolean parse(boolean)" method.
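
As a rough illustration of the first precondition, the table below sketches
one plausible mapping from XMLDocumentHandler callback names to the event
codes in javax.xml.stream.XMLStreamConstants. The mapping is an assumption
for discussion, not something taken from Xerces. Callbacks that do not
correspond to exactly one StAX event are left out on purpose: emptyElement
would presumably have to produce START_ELEMENT followed by END_ELEMENT, and
startCDATA/endCDATA only bracket character data. XniToStaxMapping is a
hypothetical name, and the callbacks appear only as strings, so the code is
not wired to XNI.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLStreamConstants;

// Hypothetical lookup table; it only documents the intended correspondence.
class XniToStaxMapping {

    /** Maps an XNI document handler callback name to a StAX event code. */
    static Map<String, Integer> mapping() {
        Map<String, Integer> m = new LinkedHashMap<String, Integer>();
        m.put("startDocument", XMLStreamConstants.START_DOCUMENT);
        m.put("doctypeDecl", XMLStreamConstants.DTD);
        m.put("startElement", XMLStreamConstants.START_ELEMENT);
        m.put("characters", XMLStreamConstants.CHARACTERS);
        m.put("ignorableWhitespace", XMLStreamConstants.SPACE);
        m.put("comment", XMLStreamConstants.COMMENT);
        m.put("processingInstruction", XMLStreamConstants.PROCESSING_INSTRUCTION);
        m.put("endElement", XMLStreamConstants.END_ELEMENT);
        m.put("endDocument", XMLStreamConstants.END_DOCUMENT);
        return m;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Integer> e : mapping().entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```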

Then I looked through the current Xerces implementation and found that the
AbstractXMLDocumentParser class implements the XMLDocumentHandler,
XMLDTDHandler, and XMLDTDContentModelHandler interfaces. Both
AbstractDOMParser and AbstractSAXParser extend AbstractXMLDocumentParser,
so I think I can implement an AbstractStAXParser that extends
AbstractXMLDocumentParser to receive XML events.

For example, code in current AbstractSAXParser:

    public void comment(XMLString text, Augmentations augs) throws XNIException {
        try {
            // SAX2 extension
            if (fLexicalHandler != null) {
                fLexicalHandler.comment(text.ch, 0, text.length);
            }
        }
        catch (SAXException e) {
            throw new XNIException(e);
        }
    } // comment(XMLString)



And in my AbstractStAXParser, it might be implemented like this:

    public class AbstractStAXParser extends AbstractXMLDocumentParser {
        public int m_curEventType;
        public String m_characters;
        ….

        public void comment(XMLString text, Augmentations augs) throws XNIException {
            // Record the event so that the stream reader can report it.
            m_curEventType = XMLStreamConstants.COMMENT;
            m_characters = new String(text.ch, text.offset, text.length);
        }
        …
    }

Meanwhile, XMLPullParserConfiguration
<http://xerces.apache.org/xerces2-j/javadocs/xni/org/apache/xerces/xni/parser/XMLPullParserConfiguration.html>
will be used to control the parsing process. XML11Configuration is the
implementation of the XMLPullParserConfiguration interface in Xerces. I
think I can implement a StAXParserConfiguration that extends
XML11Configuration, covering both XML 1.0 and XML 1.1. At runtime, the
AbstractStAXParser will be registered as the handler of the
StAXParserConfiguration instance.

As for XMLStreamReader, it could be implemented like this:

    public class StAXXMLStreamReader implements XMLStreamReader {
        public StAXParserConfiguration m_configuration;
        public StAXParser m_parser;
        ….

        public int getEventType() {
            return m_parser.m_curEventType;
        }

        public int next() throws XMLStreamException {
            try {
                // Advance the underlying pull configuration by one step.
                m_configuration.parse(false);
            }
            catch (IOException e) {
                // parse(boolean) declares IOException; StAX callers expect XMLStreamException.
                throw new XMLStreamException(e);
            }
            catch (XNIException e) {
                throw new XMLStreamException(e);
            }
            return m_parser.m_curEventType;
        }
        …
    }
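
One open point in this design: if a single parse(false) step can fire zero
or several handler callbacks (an assumption that still has to be checked
against the Xerces scanners), a single m_curEventType field would skip or
repeat events. A minimal sketch of one way around this is to let the
handler push events into a queue and let next() keep stepping until the
queue is non-empty. Everything below is a hypothetical stand-in: StepParser
plays the role of the pull configuration, EventBuffer the role of the
handler, and the "parser" is just a script of event codes, so none of it is
real Xerces code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Queue;
import javax.xml.stream.XMLStreamConstants;

/** Hypothetical stand-in for the single step offered by parse(boolean). */
interface StepParser {
    /** Runs one parsing step; returns false once the input is exhausted. */
    boolean parse(boolean complete);
}

/** Hypothetical stand-in for the handler: buffers pushed event codes. */
class EventBuffer {
    private final Queue<Integer> pending = new ArrayDeque<Integer>();
    void push(int eventType) { pending.add(eventType); }
    boolean isEmpty() { return pending.isEmpty(); }
    int poll() { return pending.remove(); }
}

/** Pull-style front end: steps the parser until an event is buffered. */
class PullCursor {
    private final StepParser parser;
    private final EventBuffer buffer;
    private boolean more = true;
    private int current = XMLStreamConstants.START_DOCUMENT;

    PullCursor(StepParser parser, EventBuffer buffer) {
        this.parser = parser;
        this.buffer = buffer;
    }

    boolean hasNext() {
        fill();
        return !buffer.isEmpty();
    }

    int next() {
        fill();
        if (buffer.isEmpty()) {
            throw new NoSuchElementException("no more events");
        }
        current = buffer.poll();
        return current;
    }

    int getEventType() { return current; }

    /** Steps the parser until at least one event is buffered or input ends. */
    private void fill() {
        while (buffer.isEmpty() && more) {
            more = parser.parse(false);
        }
    }
}

class PullCursorDemo {
    /** Drives a scripted parser and returns the events in delivery order. */
    static List<Integer> run() {
        final EventBuffer buffer = new EventBuffer();
        // Each array lists the callbacks fired by one parse(false) step.
        final Iterator<int[]> steps = Arrays.asList(
                new int[] {XMLStreamConstants.START_ELEMENT},
                new int[] {},  // a step that reports nothing
                new int[] {XMLStreamConstants.COMMENT, XMLStreamConstants.END_ELEMENT},
                new int[] {XMLStreamConstants.END_DOCUMENT}).iterator();
        StepParser parser = new StepParser() {
            public boolean parse(boolean complete) {
                for (int event : steps.next()) {
                    buffer.push(event);
                }
                return steps.hasNext();
            }
        };
        PullCursor cursor = new PullCursor(parser, buffer);
        List<Integer> seen = new ArrayList<Integer>();
        while (cursor.hasNext()) {
            seen.add(cursor.next());
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

With the queue in place, the empty step is skipped silently and the two
events from the third step are delivered by two separate next() calls.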



Those are my rough thoughts so far. If you have any comments or questions,
I would be glad to discuss them.



Thanks, Wei
