Looks like we've got a thread going, Eran!

Dan, I don't think anyone has done a performance analysis for a typed parser as such. It'd really need to be done in the context of some sort of data binding framework to be meaningful. The only thing which has been done along these lines that I'm aware of is Sun's "FAST Web Services", which merged mutant forms of JAXB and JAX-RPC so that they could do binary input/output. In their case they used ASN.1 encoding/decoding of the binary data, with the ASN.1 representation generated from an XML Schema.

They saw much faster performance than the conventional JAX-RPC code. But, my own JibxSoap (a subproject of JiBX, http://www.jibx.org) delivers performance that appears to be about as good while still using standard text XML. I say "appears to be" because at the time I did the web services performance comparisons (http://www.sosnoski.com/presents/cleansoap/comparing.html) the Sun stuff was all proprietary. They've since opened it up on java.net, I think, though I don't know what kind of license restrictions might apply.

My own gut feeling is that if I used a typed parser interface for binary input/output with JiBX/JibxSoap I could probably get 2-2.5 X the processing speed of text (vs. probably about 1.4-1.8 X with my XBIS binary XML format, which still keeps values as text and can be translated to and from the text representation).
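To make concrete where a typed interface would save work with a binary format, here is a minimal sketch of the two paths. It is only an illustration: the class and method names are hypothetical, the binary encoding is assumed to be a plain 4-byte big-endian int, and none of this is the JiBX, XBIS, or StAX API.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not part of JiBX, XBIS, or StAX.
final class TypedVersusText {

    // Text-only parser API: the binary int is decoded, turned into a String
    // to fit the API, then parsed back into an int by the binding code.
    static int viaText(ByteBuffer in) {
        int decoded = in.getInt();
        String text = Integer.toString(decoded); // forced by a text-only API
        return Integer.parseInt(text);           // undone again by the caller
    }

    // Typed parser API (assumed): the decoded int is handed straight through.
    static int viaTyped(ByteBuffer in) {
        return in.getInt();
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(4);
        buf.putInt(12345);
        buf.flip();
        System.out.println(viaText(buf));
        buf.rewind();
        System.out.println(viaTyped(buf));
    }
}
```

Both paths return the same value; the text path just allocates a String and parses it again for every value read.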

There are actually some other areas where parser usability could be improved, though, besides implementing a typed interface. I think implementing a parser that supplied element and attribute names as singleton QName objects of some form (rather than separate namespace URI, local name, and qualified name text values) would be a big gain, for instance. The text APIs could also be better designed; in the case of the StAX XMLStreamReader, rather than returning an array plus start offset plus length for element content, all using separate method calls, it'd be cleaner to just return the equivalent of a JDK 1.4 CharSequence (which could be reusable). Likewise on the attribute values, where StAX returns Strings. Returning CharSequence-equivalents would not only avoid unnecessary String creation (in the case of attribute values), it would also eliminate the need to translate the raw byte stream to character arrays for common encodings (especially the UTF-8 and UTF-16 used in BP-compliant web services).
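A rough sketch of those two ideas follows, as a sketch under stated assumptions rather than a proposal for a specific API. All of the names (Name, NameTable, BufferSlice) are hypothetical, the code uses Java 8 syntax for brevity, and the slice is backed by a char array, so it does not cover the further step of reading directly from the raw bytes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not an existing parser API.
final class ParserApiSketch {

    /** Immutable qualified name; one shared instance per (uri, local) pair. */
    static final class Name {
        final String uri;
        final String local;
        private Name(String uri, String local) {
            this.uri = uri;
            this.local = local;
        }
    }

    /** Per-parser table that hands out the shared Name instances. */
    static final class NameTable {
        private final Map<String, Name> names = new HashMap<>();

        Name get(String uri, String local) {
            // Assumes the namespace URI itself contains no '}' character.
            String key = '{' + uri + '}' + local;
            return names.computeIfAbsent(key, k -> new Name(uri, local));
        }
    }

    /** Reusable CharSequence view over a slice of the parser's buffer. */
    static final class BufferSlice implements CharSequence {
        private char[] buf = new char[0];
        private int start;
        private int len;

        /** Points this view at a new slice without allocating. */
        BufferSlice reset(char[] buf, int start, int len) {
            this.buf = buf;
            this.start = start;
            this.len = len;
            return this;
        }

        @Override public int length() { return len; }

        @Override public char charAt(int index) {
            if (index < 0 || index >= len) {
                throw new IndexOutOfBoundsException("index " + index);
            }
            return buf[start + index];
        }

        @Override public CharSequence subSequence(int from, int to) {
            if (from < 0 || to > len || from > to) {
                throw new IndexOutOfBoundsException(from + ".." + to);
            }
            return new String(buf, start + from, to - from);
        }

        /** Copies only when the caller really needs a String. */
        @Override public String toString() { return new String(buf, start, len); }
    }
}
```

With shared Name instances, binding code can match elements with a reference comparison (==) instead of comparing the URI and local name strings separately, and a single BufferSlice can be reset for each text event or attribute value so that a String is created only when toString() is called.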

Unfortunately, I think developers sometimes misapply Knuth's (or Hoare's - I'm not sure who got this started) "premature optimization is the root of all evil" aphorism by designing APIs without any thought to performance. Once performance bottlenecks have been built into the APIs it's very difficult to get around them without scrapping things and starting over.

 - Dennis

Dan Diephouse wrote:

Has anyone done any performance tests (binary or just plain text) with the typed StAX stuff? Does it really make a difference?
- Dan

Eran Chinthaka wrote:

Hi Dennis,

You have commented on the typed pull parser in the wiki. Shall we start a
thread about it here?

-- EC

-----Original Message-----
From: Apache Wiki [mailto:[EMAIL PROTECTED]
Sent: Thursday, July 28, 2005 10:31 PM
To: [email protected]
Subject: [Ws Wiki] Update of "FrontPage/Axis2/Tasks/BinarySerialization"
by DennisSosnoski

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Ws Wiki" for
change notification.

The following page has been changed by DennisSosnoski:
http://wiki.apache.org/ws/FrontPage/Axis2/Tasks/BinarySerialization

------------------------------------------------------------------------------
 decoding the binary into an int, converting to a string for the parser
 API and then back to an int in the deserialisation code.

+ I (DennisSosnoski) would personally disagree with the above assessment.
A typed pull parser would definitely be nice, but even without this you
can get substantial size and performance gains from a binary format. See
my articles on devWorks at
http://www-128.ibm.com/developerworks/xml/library/x-trans1.html and
http://www-128.ibm.com/developerworks/xml/library/x-trans2/index.html
for examples.
+





