Why Pull-Parser faster ? - Still NO answer

Ricky Ho Thu, 20 Feb 2003 17:51:20 -0800

Just between SAX and XPP ....

If SAX can be stopped at any time by throwing an exception, then the application can control when to stop. And if XPP is not skipping whitespace ...

Why is XPP faster than SAX?

Best regards,
Ricky
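
The "stop SAX by throwing an exception" technique mentioned above can be sketched roughly as follows. This is only an illustration, using the SAX parser bundled with the JDK: the class name SaxEarlyStop, the marker exception StopParsing, and the helper firstBrainText are made up for this example and are not part of any library.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEarlyStop {
    // Hypothetical marker exception, used only to unwind out of the parser.
    static class StopParsing extends SAXException {
        StopParsing() { super("stop requested by handler"); }
    }

    // Collects the text of the first <brain> element, then aborts the parse.
    static String firstBrainText(String xml) throws Exception {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private boolean inside = false;

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if ("brain".equals(qName)) inside = true;
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inside) text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String local, String qName) throws SAXException {
                // The application decides it has seen enough and stops the parser.
                if ("brain".equals(qName)) throw new StopParsing();
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (StopParsing expected) {
            // Expected exit path: the handler asked to stop early.
        }
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstBrainText("<jim>\n<brain>Hi there</brain>\n</jim>"));
    }
}
```

Note that the parse cannot be resumed after the exception; the application can only start over from the beginning of the input.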

Date: Thu, 20 Feb 2003 20:34:06 -0500
From: Aleksander Slominski <[EMAIL PROTECTED]>
Subject: Re: Why Pull-Parser faster ?

Brain, Jim wrote:
In XPP (XML Pull Parser), the idea is like SAX: no in-memory model is
kept, and the code basically scans through the XML.  However, as Anne
stated, you can stop in the middle of a parse in XPP and continue later, or
start over, or whatever.  Also, most XPP parsers throw away some of the XML
information, like extra whitespace, in order to gain more performance.  If
you have to scan the entire XML feed, XPP is still faster, because it throws
away information, but the most pronounced speedup in XPP comes when you do a
conditional partial parse of the doc.  (It is much harder, if possible at
all, to do a conditional partial parse using SAX.)

So:
DOM is memory and 1+ scans, all XML entities (once by the parser, more by
your app)
SAX is no memory and 1 scan, all XML entities
XPP is no memory and 0-1 scans, not all XML entities (your app scans as it
desires)

Examples of things XPP throws away:

<jim>

<brain>Hi there</brain>

</jim>

In DOM and SAX, the whitespace between jim and brain is represented in the
model, because it might be necessary, but in XPP, the document gets
represented as:

<jim><brain>Hi there</brain></jim>

Reason:  XPP is tuned for SOAP and structured XML work, where the whitespace
and CRLF marks can be assumed to be there for prettiness only, and have no
code value.

Hi Jim,

I think what you describe is the handling of element content when mixed content is disabled. That was possible in XPP2 but is no longer the case in XPP3, which implements the XmlPull API. XPP3 does not throw anything away, but it lets you skip over content: call the next() method and simply do not retrieve the event information.

In XPP2, when the optional mode to disable mixed content was activated, you would get the input as described in the example you gave.

In XmlPull, to achieve a similar result you can use nextText()/nextTag() to skip whitespace if you wish, but you can also see all XML infoset events in the input any time you want (just use next()).
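
As a rough illustration of the difference between walking every event and skipping whitespace, here is a small sketch. It deliberately uses StAX (javax.xml.stream, bundled with the JDK) rather than XmlPull itself, so that it runs without extra jars; StAX's nextTag() and getElementText() are assumed here to play roughly the role of XmlPull's nextTag() and nextText(). The class name PullSkipDemo and its helper methods are made up for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class PullSkipDemo {
    static final String XML = "<jim>\n\n<brain>Hi there</brain>\n\n</jim>";

    static XMLStreamReader open(String xml) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Merge adjacent character data so each text run is a single event.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        return factory.createXMLStreamReader(new StringReader(xml));
    }

    // Walks the input with next(): nothing is thrown away, and the
    // whitespace between tags shows up as its own event.
    static List<String> allEvents(String xml) throws Exception {
        XMLStreamReader r = open(xml);
        List<String> out = new ArrayList<>();
        while (r.hasNext()) {
            switch (r.next()) {
                case XMLStreamConstants.START_ELEMENT:
                    out.add("START " + r.getLocalName());
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    out.add("END " + r.getLocalName());
                    break;
                case XMLStreamConstants.CHARACTERS:
                case XMLStreamConstants.SPACE:
                    out.add(r.isWhiteSpace() ? "WS" : "TEXT " + r.getText());
                    break;
                default:
                    break;
            }
        }
        return out;
    }

    // Same input, but the application chooses to skip whitespace by
    // asking only for tags and element text.
    static String brainText(String xml) throws Exception {
        XMLStreamReader r = open(xml);
        r.nextTag();                      // <jim>
        r.nextTag();                      // skips whitespace, lands on <brain>
        String text = r.getElementText(); // reads the text, ends on </brain>
        r.nextTag();                      // skips whitespace, lands on </jim>
        return text;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(allEvents(XML));
        System.out.println(brainText(XML));
    }
}
```

In both cases the parser reads the same characters; the only difference is which events the application asks to see.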

thanks,

alek

--
"Mr. Pauli, we in the audience are all agreed that your theory is crazy. What divides us is whether it is crazy enough to be true." Niels H. D. Bohr
