[ 
https://issues.apache.org/jira/browse/JCR-1261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12549976
 ] 

fmeschbe edited comment on JCR-1261 at 12/10/07 1:09 AM:
------------------------------------------------------------------

I just ran the litmus tests, which show that all but one test succeeded or 
failed exactly as when using Xerces. The single failing test that succeeds 
with Xerces is test 18 (propget) of the "props" testsuite. In fact, test 17 
(prophighunicode) also fails internally.

The problem is as follows: test 17 of the props testsuite sends a property 
value containing the high Unicode character 0x10000 (decimal 65536), which must 
be represented as a surrogate pair in UTF-16 (used by Java). Both KXml2 and 
MXP1 handle high Unicode characters incorrectly, while Xerces handles them 
correctly. The result is that test 17 stores (or tries to store) a 
single-character string whose character value is NUL (0x0).
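
To make the surrogate pair point concrete, here is a minimal Java sketch. It 
is not taken from KXml2 or MXP1; the class and method names are hypothetical, 
and the "truncation" variant is only an assumption about how a NUL could 
arise (0x10000 cut down to 16 bits is 0x0000). Character.toChars requires 
Java 1.5 or later.

```java
final class HighUnicodeSketch {
    // Correct handling: expand the code point into UTF-16 code units.
    // For U+10000 this yields the surrogate pair 0xD800 0xDC00.
    static String decodeCorrectly(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    // Assumed faulty handling (illustration only): a narrowing cast keeps
    // just the low 16 bits, so 0x10000 collapses to 0x0000 (NUL).
    static String decodeByTruncation(int codePoint) {
        return String.valueOf((char) codePoint);
    }

    public static void main(String[] args) {
        int cp = 0x10000;
        String good = decodeCorrectly(cp);
        String bad = decodeByTruncation(cp);
        System.out.println("correct length:   " + good.length());
        System.out.println("high surrogate:   0x" + Integer.toHexString(good.charAt(0)));
        System.out.println("low surrogate:    0x" + Integer.toHexString(good.charAt(1)));
        System.out.println("truncated length: " + bad.length());
        System.out.println("truncated value:  0x" + Integer.toHexString(bad.charAt(0)));
    }
}
```

If the parsers do something along these lines, that would match the observed 
single-character NUL string, but I have not verified this in their sources.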

Test 18 then tries to send this value back, which results in different errors: 
KXml2 encodes this NUL as &#0;, which causes litmus (or neon) to fail parsing 
the result. MXP1, on the other hand, already fails when serializing this NUL 
(which is probably more correct, as &#0; is not a valid XML character 
reference).
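
As a rough illustration of why a serializer should refuse NUL, the following 
sketch checks a code point against the Char production of XML 1.0 before 
writing a numeric character reference. The class and both method names are 
hypothetical and do not reflect the KXml2 or MXP1 API.

```java
final class XmlCharSketch {
    // XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Hypothetical helper: emit a hexadecimal character reference, or fail
    // for code points that XML 1.0 does not allow (such as NUL).
    static String toCharRef(int cp) {
        if (!isXmlChar(cp)) {
            throw new IllegalArgumentException(
                "not a valid XML 1.0 character: 0x" + Integer.toHexString(cp));
        }
        return "&#x" + Integer.toHexString(cp).toUpperCase() + ";";
    }

    public static void main(String[] args) {
        System.out.println(toCharRef(0x10000));
        System.out.println(isXmlChar(0x0));
    }
}
```

With a check like this, the high Unicode character from test 17 is accepted 
while NUL is rejected at serialization time, which is the behaviour MXP1 shows.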

I have already pinged the developer of KXml2 regarding this issue and the 
chances of getting a fix (provided I supply a patch).

In the meantime, the question is whether this (IMHO minor) issue should stop us 
from making this change, given the other advantages in performance and memory 
footprint.

As a matter of fact, Java 1.4 does not have proper support for high Unicode 
values built into the platform. Only Java 1.5 brings support for this.

I weigh the performance and memory footprint advantages higher than the 
unlikely event of encountering a high Unicode character, particularly given 
Java 1.4's lack of full high Unicode character support.

> Replace Xerces/JAXP based (un)marshalling by XPP3 based implementation
> ----------------------------------------------------------------------
>
>                 Key: JCR-1261
>                 URL: https://issues.apache.org/jira/browse/JCR-1261
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: jackrabbit-webdav
>    Affects Versions: 1.3.3
>            Reporter: Felix Meschberger
>         Attachments: JCR-1261.patch
>
>
> I propose replacing the current Xerces/JAXP based implementation of XML data 
> unmarshalling and marshalling with an MXP ([1]) based 
> implementation. MXP is an implementation of the XML PullParser API [2] and 
> provides a very small footprint (120KB) and far better performance than Xerces 
> (my tests showed around a 50% performance increase over Xerces for both 
> unmarshalling and marshalling, see also [3]).
> Why do I care? I would like to include WebDAV functionality in Sling and 
> bundle it with as few dependencies as possible (the xpp3 lib might even 
> be bundled into this project to minimize dependencies). In addition, I think 
> XML (un)marshalling performance is crucial in any WebDAV solution. For this 
> reason, this project can only win if we adopt a better-performing library.
> [1] http://www.extreme.indiana.edu/xgws/xsoap/xpp/mxp1/
> [2] http://www.xmlpull.org
> [3] http://www.extreme.indiana.edu/~aslom/xpp_sax2bench/results.html

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
