wei duan wrote:
> Hello, everyone,
>
> I'm a student applying for the SoC <http://code.google.com/soc/> project
> "Add support for the StAX (JSR-173) cursor API to Xerces-J"
> <http://wiki.apache.org/general/SummerOfCode2007>. Michael suggested
> that I discuss my proposal on the mailing list, so I would like to
> introduce my thoughts and plan for this student project. Any comments
> are welcome. : )
>
> The abstract description of the project is: "To design and implement the
> cursor-based XMLStreamReader
> <http://java.sun.com/javase/6/docs/api/javax/xml/stream/XMLStreamReader.html>
> (and filtering
> <http://java.sun.com/javase/6/docs/api/javax/xml/stream/StreamFilter.html>
> support). It should be possible to accomplish this using XNI
> <http://xerces.apache.org/xerces2-j/xni.html> by building the
> XMLStreamReader
> <http://java.sun.com/javase/6/docs/api/javax/xml/stream/XMLStreamReader.html>
> on top of an XMLPullParserConfiguration
> <http://xerces.apache.org/xerces2-j/javadocs/xni/org/apache/xerces/xni/parser/XMLPullParserConfiguration.html>."
>
>
> Besides XNI, there are several other ways to implement the StAX
> interfaces. For example, one could parse the XML document as raw text
> and start from scratch (scanning characters, building tokens,
> interpreting tokens, and so on), or implement a converter on top of the
> existing DOM or SAX interfaces. However, after reading the Xerces source
> code, I found that both the SAX and DOM implementations are based on
> XNI, so it is very natural to build StAX on XNI as well.
>
> To implement XMLStreamReader, two important preconditions must
> hold:
>
> 1. XML event information can be received.
>
> 2. The pull style parsing process can be simulated.
>
> When I looked through the XNI interfaces, I found that XNI actually
> meets these two preconditions. The handler interfaces in XNI, such as
> XMLDocumentHandler and XMLDTDHandler, receive XML events including
> startDocument and endDocument, which can easily be mapped to the
> corresponding StAX events. The XMLPullParserConfiguration
> <http://xerces.apache.org/xerces2-j/javadocs/xni/org/apache/xerces/xni/parser/XMLPullParserConfiguration.html>
> interface in XNI represents a parser configuration that can
> be used as the configuration for a "pull" parser, so the pull
> parsing process of StAX can be simulated by calling the "boolean
> parse(boolean)" method of XMLPullParserConfiguration
> <http://xerces.apache.org/xerces2-j/javadocs/xni/org/apache/xerces/xni/parser/XMLPullParserConfiguration.html>.
>
> Then I looked through the current Xerces implementation and found that
> the AbstractXMLDocumentParser class implements the XMLDocumentHandler,
> XMLDTDHandler, and XMLDTDContentModelHandler interfaces. Both
> AbstractDOMParser and AbstractSAXParser extend
> AbstractXMLDocumentParser, so I think I can implement an
> AbstractStAXParser that extends AbstractXMLDocumentParser to receive
> XML events.
>
> For example, code in current AbstractSAXParser:
>
>     public void comment(XMLString text, Augmentations augs) throws XNIException {
>         try {
>             // SAX2 extension
>             if (fLexicalHandler != null) {
>                 fLexicalHandler.comment(text.ch, 0, text.length);
>             }
>         }
>         catch (SAXException e) {
>             throw new XNIException(e);
>         }
>     } // comment(XMLString)
>
> And in my AbstractStAXParser, it might be implemented like this:
>
>     public class AbstractStAXParser extends AbstractXMLDocumentParser {
>         // Type of the most recent event, as defined in XMLStreamConstants
>         public int m_curEventType;
>         // Text carried by the most recent event
>         public String m_characters;
>         ….
>         public void comment(XMLString text, Augmentations augs) throws XNIException {
>             m_curEventType = XMLStreamConstants.COMMENT;
>             // Copy the text out: the XMLString buffer is only valid during the callback
>             m_characters = new String(text.ch, text.offset, text.length);
>         }
>         …
>     }
>
> Meanwhile, XMLPullParserConfiguration
> <http://xerces.apache.org/xerces2-j/javadocs/xni/org/apache/xerces/xni/parser/XMLPullParserConfiguration.html>
> will be used to control the parsing process. XML11Configuration is the
> implementation of the XMLPullParserConfiguration interface in Xerces. I
> think I can implement a StAXParserConfiguration that extends
> XML11Configuration to cover XML 1.0 and XML 1.1. At runtime, the
> AbstractStAXParser will be set as the handler of the
> StAXParserConfiguration instance.
>
> As for XMLStreamReader, it could be implemented like this:
>
>     public class StAXXMLStreamReader implements XMLStreamReader {
>         public StAXParserConfiguration m_configuration;
>         public StAXParser m_parser;
>         ….
>         public int getEventType() {
>             return m_parser.m_curEventType;
>         }
>
>         public int next() throws XMLStreamException {
>             try {
>                 // Ask the configuration to parse some more of the document
>                 m_configuration.parse(false);
>             } catch (IOException e) {
>                 // parse(boolean) declares IOException; next() may only throw XMLStreamException
>                 throw new XMLStreamException(e);
>             }
>             return m_parser.m_curEventType;
>         }
>         …
>     }
>
>
> These are some of my rough thoughts. If you have any comments or
> questions, I would be glad to discuss them.
>
Hi Wei,

I have implemented a pull parser on top of Xerces XNI using the above
approach. However, I discovered that parse(false) will *not* parse
exactly one event; it is closer to parseSome(). You never know how many
events one call will deliver, so you need to either buffer them in a
potentially very large buffer (unbounded growth), or run parsing in a
separate thread and use a blocking queue with limited capacity, with the
pull parser reading events from it in the user thread. For details you
can check out my code from CVS at
http://sourceforge.net/projects/xni2xmlpull/
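
The second option can be sketched roughly as follows. This is only a
minimal illustration and not the code from xni2xmlpull: it has no Xerces
dependency, and every name in it (QueueBackedPullReader, Event, and the
pushParser callback that stands in for the XNI parse loop) is
hypothetical, chosen just for this example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

/** Hypothetical pull reader fed by a push-style parser running in its own thread. */
final class QueueBackedPullReader {

    /** Hypothetical event record; a real reader would also carry names, attributes, etc. */
    static final class Event {
        final String type;
        final String text;
        Event(String type, String text) { this.type = type; this.text = text; }
    }

    /** Sentinel marking the end of the stream; compared by identity only. */
    private static final Event END = new Event("END", null);

    private final BlockingQueue<Event> queue;
    private final Thread producer;
    private boolean finished; // touched only by the consuming (user) thread

    /**
     * @param pushParser stand-in for the parse loop (assumption): it receives a sink
     *                   and may call it any number of times per parsing step
     * @param capacity   maximum number of events buffered ahead of the consumer
     */
    QueueBackedPullReader(Consumer<Consumer<Event>> pushParser, int capacity) {
        queue = new ArrayBlockingQueue<>(capacity);
        producer = new Thread(() -> {
            try {
                pushParser.accept(this::enqueue);
            } finally {
                enqueue(END); // always tell the consumer that no more events will come
            }
        }, "push-parser");
        producer.setDaemon(true); // do not keep the JVM alive if the reader is abandoned
        producer.start();
    }

    /** Called on the parser thread; blocks while the queue is full. */
    private void enqueue(Event e) {
        try {
            queue.put(e);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("parser thread interrupted", ex);
        }
    }

    /** Called on the user thread; returns the next event, or null at end of document. */
    Event next() {
        if (finished) {
            return null;
        }
        try {
            Event e = queue.take(); // blocks until the parser thread delivers something
            if (e == END) {
                finished = true;
                return null;
            }
            return e;
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for an event", ex);
        }
    }
}
```

The bounded queue gives backpressure: once `capacity` events are waiting,
the parser thread blocks in put() until the user thread calls next(), so
memory stays limited no matter how many events a single parsing step
delivers. The price is an extra thread and one handoff per event.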

best,

alek

-- 
The best way to predict the future is to invent it - Alan Kay

