[ http://issues.apache.org/jira/browse/DERBY-2106?page=comments#action_12451830 ]

Daniel John Debrunner commented on DERBY-2106:
----------------------------------------------

The XML specification says the XML processor MUST behave as though all line
endings have been converted to a single #xA character (which is '\n'):

http://www.w3.org/TR/REC-xml/#sec-line-ends

I would say that means the XML processor in Derby should behave such that 
new-lines are converted to a single '\n' character when using XMLSERIALIZE.
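
As a rough sketch of what that normalization amounts to (this is not Derby
code; the class and method names are made up for illustration), the rule in
the spec translates "\r\n" and any lone "\r" to a single "\n". Unlike the
patch quoted below, this sketch does not consult the "line.separator" system
property, so it would also rewrite a carriage return that was part of the
user's data. Whether that is acceptable is an open question.

```java
// Hypothetical helper, not part of Derby: normalizes line endings in an
// already-serialized XML string the way the XML spec describes for input.
public final class LineEndings {
    private LineEndings() {}

    /** Translates "\r\n" and any lone "\r" to a single "\n". */
    public static String normalize(String serialized) {
        if (serialized == null) {
            return null;
        }
        // Replace CRLF first so each pair collapses to one '\n',
        // then convert any remaining lone CR.
        return serialized.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        String windowsStyle = "<a>\r\n  <b/>\r\n</a>";
        // prints "true"
        System.out.println(normalize(windowsStyle).equals("<a>\n  <b/>\n</a>"));
    }
}
```

Using String.replace rather than String.replaceAll avoids treating the
search string as a regular expression.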

> Improve Derby SQL/XML processing to account for Xalan's use of 
> platform-specific newlines when serializing.
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-2106
>                 URL: http://issues.apache.org/jira/browse/DERBY-2106
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.2.1.6, 10.3.0.0, 10.2.1.8
>            Reporter: A B
>            Priority: Minor
>
> Derby uses Apache Xalan to serialize XML data values.  As part of the 
> serialization process Xalan converts the newline character ("\n") to a 
> platform-specific line ending.  This conversion of line endings is allowed by 
> XML serialization rules and therefore is not a bug in Xalan--see XALANJ-1137 
> for some discussion along those lines.  That said, though, this particular 
> behavior means that an application which uses Derby to serialize XML values 
> can end up with different characters on different platforms.  And further, 
> since Derby currently writes serialized XML to disk, this means that 
> insertion of an XML value on one platform (such as Windows) can lead to 
> different line-ending characters on disk than insertion of that exact same 
> XML value on another platform (such as Linux).
> Discussion on the derby-dev list seems to indicate (based on lack of comments 
> to the contrary) that this behavior in Derby is not a "bug" per se, but that 
> it might be nice if Derby could somehow account for Xalan's treatment of 
> newlines to provide consistent XML serialization results across platforms.  
> The relevant thread is here:
>   http://thread.gmane.org/gmane.comp.apache.db.derby.devel/33170/focus=33170 
> As indicated in that thread, one simple (but not fully tested) approach is to 
> make a change in the "serializeToString()" method of SqlXmlUtil.java to do an 
> explicit replacement of platform-specific line-endings with a simple newline. 
>  Something like:
> +        String eol = PropertyUtil.getSystemProperty("line.separator");
> +        if (eol != null)
> +            return sWriter.toString().replaceAll(eol, "\n");
>         return sWriter.toString();
> This small change seems to provide consistent results across all platforms, 
> and appears to work correctly even if line-endings are hard-coded in the XML 
> file (ex. if the literal "\r\n" occurs in the XML file, the above code will 
> *not* replace it, which is good).  However, internal modification of 
> user-supplied data is generally a risky proposition, so more testing would be 
> needed for this particular approach.
> Also, any changes to Derby serialization as a part of this issue would need 
> to consider backward-compatibility issues--namely, how would the changes 
> affect XML files that have already been inserted into the database (and 
> therefore that already have platform-specific endings)?  Ideally treatment of 
> existing and new XML data would be consistent.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
