Re: [Axis2] Binary Serialisation

Anne Thomas Manes Fri, 29 Jul 2005 05:38:06 -0700

This project seems to provide substantial performance advantages in
XML processing. Take a look at the benchmark paper:
http://www.ximpleware.com/.


Anne

On 7/29/05, Dennis Sosnoski <[EMAIL PROTECTED]> wrote:
> Looks like we've got a thread going, Eran!
> 
> Dan, I don't think anyone has done a performance analysis for a typed
> parser as such. It'd really need to be done in the context of some sort
> of data binding framework to be meaningful. The only thing which has
> been done along these lines that I'm aware of is Sun's "FAST Web
> Services", which merged mutant forms of JAXB and JAX-RPC so that they
> could do binary input/output. In their case they used ASN.1
> encoding/decoding of the binary data, with the ASN.1 representation
> generated from an XML Schema.
> 
> They saw much faster performance than the conventional JAX-RPC code.
> But, my own JibxSoap (a subproject of JiBX, http://www.jibx.org)
> delivers performance that appears to be about as good while still using
> standard text XML. I say "appears to be" because at the time I did the
> web services performance comparisons
> (http://www.sosnoski.com/presents/cleansoap/comparing.html) the Sun
> stuff was all proprietary. They've since opened it up on java.net, I
> think, though I don't know what kind of license restrictions might apply.
> 
> My own gut feeling is that if I used a typed parser interface for binary
> input/output with JiBX/JibxSoap I could probably get 2-2.5 X the
> processing speed of text (vs. probably about 1.4-1.8 X with my XBIS
> binary XML format, which still keeps values as text and can be
> translated to and from the text representation).
> 
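To make the typed-interface point concrete, here is a minimal sketch of the round trip a text-only parser API forces on binary data. It is not JiBX, JibxSoap, or XBIS code; the class and method names are hypothetical, and a plain 4-byte big-endian int stands in for whatever the binary format would actually use.

```java
import java.nio.ByteBuffer;

class TypedReadSketch {
    // Text-only API path: the int is decoded from the binary stream, turned
    // into a String so it can pass through the parser interface, and then
    // parsed back into an int by the data binding code.
    static int readViaText(ByteBuffer in) {
        int decoded = in.getInt();
        String asText = Integer.toString(decoded); // extra allocation
        return Integer.parseInt(asText);           // extra parse
    }

    // Typed API path: the decoded value is handed over as-is.
    static int readTyped(ByteBuffer in) {
        return in.getInt();
    }

    public static void main(String[] args) {
        // Assumed wire form for this sketch: one big-endian 4-byte int.
        byte[] wire = ByteBuffer.allocate(4).putInt(2005).array();
        System.out.println(readViaText(ByteBuffer.wrap(wire)));
        System.out.println(readTyped(ByteBuffer.wrap(wire)));
    }
}
```

Both paths return the same value; the difference is only the String created and re-parsed along the way, which is the overhead a typed interface would remove. The sketch says nothing about how large that saving is in practice.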
> There are actually some other areas where parser usability could be
> improved, though, besides implementing a typed interface. I think
> implementing a parser that supplied element and attribute names as
> singleton QName objects of some form (rather than separate namespace
> URI, local name, and qualified name text values) would be a big gain,
> for instance. The text APIs could also be better designed; in the case
> of the StAX XMLStreamReader, rather than returning an array plus start offset
> plus length for element content, all using separate method calls, it'd
> be cleaner to just return the equivalent of a JDK 1.5 CharSequence
> (which could be reusable). Likewise on the attribute values, where StAX
> returns Strings. Returning CharSequence-equivalents would not only avoid
> unnecessary String creation (in the case of attribute values), it would
> also eliminate the need to translate the raw byte stream to character
> arrays for common encodings (especially the UTF-8 and UTF-16 used in
> BP-compliant web services).
> 
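The text-access point above can be sketched against the standard javax.xml.stream API. StAX exposes element content through three separate calls (getTextCharacters, getTextStart, getTextLength); the hypothetical textOf helper below wraps them in a single CharSequence view using java.nio.CharBuffer. The class name, the helper names, and the digit parser are assumptions made for illustration, not part of StAX.

```java
import java.io.StringReader;
import java.nio.CharBuffer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class CharSequenceTextSketch {
    // Wraps the parser's internal buffer without copying it into a String.
    // The view is only valid until the next call to reader.next().
    static CharSequence textOf(XMLStreamReader reader) {
        return CharBuffer.wrap(reader.getTextCharacters(),
                               reader.getTextStart(),
                               reader.getTextLength());
    }

    // Parses non-negative decimal digits straight from the view.
    // Overflow is not checked in this sketch.
    static int parseUnsigned(CharSequence digits) {
        if (digits.length() == 0) {
            throw new NumberFormatException("empty value");
        }
        int value = 0;
        for (int i = 0; i < digits.length(); i++) {
            char c = digits.charAt(i);
            if (c < '0' || c > '9') {
                throw new NumberFormatException("not a digit: " + c);
            }
            value = value * 10 + (c - '0');
        }
        return value;
    }

    // Returns the first text node of the document as an int.
    static int firstInt(String xml) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Ask for adjacent character data to be delivered as one event.
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLStreamReader reader =
                    factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    return parseUnsigned(textOf(reader));
                }
            }
            throw new IllegalStateException("no text content found");
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstInt("<count>42</count>"));
    }
}
```

Attribute values cannot be handled this way with the current interface, since getAttributeValue returns a String, and the wrapper still works on the parser's char array, so it does not remove the byte-to-char decoding step mentioned above.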
> Unfortunately, I think developers sometimes misapply Knuth's (or Hoare's
> - I'm not sure who got this started) "premature optimization is the root
> of all evil" aphorism by designing APIs without any thought to
> performance. Once performance bottlenecks have been built into the APIs
> it's very difficult to get around them without scrapping things and
> starting over.
> 
>   - Dennis
> 
> Dan Diephouse wrote:
> 
> > Has anyone done any performance tests (binary or just plain text) with
> > the typed StAX stuff? Does it really make a difference?
> > - Dan
> >
> > Eran Chinthaka wrote:
> >
> >> Hi Dennis,
> >>
> >> You have commented on typed pull parser in wiki. Shall we start a thread
> >> about it here ?
> >>
> >> -- EC
> >>
> >>
> >>
> >>> -----Original Message-----
> >>> From: Apache Wiki [mailto:[EMAIL PROTECTED]
> >>> Sent: Thursday, July 28, 2005 10:31 PM
> >>> To: [email protected]
> >>> Subject: [Ws Wiki] Update of
> >>> "FrontPage/Axis2/Tasks/BinarySerialization"
> >>> by DennisSosnoski
> >>>
> >>> Dear Wiki user,
> >>>
> >>> You have subscribed to a wiki page or wiki category on "Ws Wiki" for
> >>> change notification.
> >>>
> >>> The following page has been changed by DennisSosnoski:
> >>> http://wiki.apache.org/ws/FrontPage/Axis2/Tasks/BinarySerialization
> >>>
> >>> --------------------------------------------------------------------------
> >>>
> >>> ----
> >>>  decoding the binary into an int, converting to a string for the parser
> >>>  API and then back to an int in the deserialisation code.
> >>>
> >>> + I (DennisSosnoski) would personally disagree with the above
> >>> assessment. A typed pull parser would definitely be nice, but even
> >>> without this you can get substantial size and performance gains from
> >>> a binary format. See my articles on devWorks at
> >>> http://www-128.ibm.com/developerworks/xml/library/x-trans1.html and
> >>> http://www-128.ibm.com/developerworks/xml/library/x-trans2/index.html
> >>> for examples.
> >>> +
> >>>
> >>
> >>
> >>
> >>
> >>
> >>
> >
> >
>
