    [ http://issues.apache.org/jira/browse/DERBY-2106?page=comments#action_12451832 ]

Daniel John Debrunner commented on DERBY-2106:
----------------------------------------------

Though I have to say the phrase Army highlighted:

"When outputting a newline character in the instance of the data model, the
serializer is free to represent it using any character sequence that will be
normalized to a newline character by an XML parser"

together with the SQL Standard saying that it does not mandate any expression
as being deterministic, only "possibly non-deterministic" does allow for Derby
to have its current behaviour.

Just as confused as I was at the beginning. :-(

> Improve Derby SQL/XML processing to account for Xalan's use of
> platform-specific newlines when serializing.
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-2106
>                 URL: http://issues.apache.org/jira/browse/DERBY-2106
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.2.1.6, 10.3.0.0, 10.2.1.8
>            Reporter: A B
>            Priority: Minor
>
> Derby uses Apache Xalan to serialize XML data values. As part of the
> serialization process Xalan converts the newline character ("\n") to a
> platform-specific line ending. This conversion of line endings is allowed by
> XML serialization rules and therefore is not a bug in Xalan--see XALANJ-1137
> for some discussion along those lines. That said, though, this particular
> behavior means that an application which uses Derby to serialize XML values
> can end up with different characters on different platforms. And further,
> since Derby currently writes serialized XML to disk, this means that
> insertion of an XML value on one platform (such as Windows) can lead to
> different line-ending characters on disk than insertion of that exact same
> XML value on another platform (such as Linux).
>
> Discussion on the derby-dev list seems to indicate (based on lack of comments
> to the contrary) that this behavior in Derby is not a "bug" per se, but that
> it might be nice if Derby could somehow account for Xalan's treatment of
> newlines to provide consistent XML serialization results across platforms.
> The relevant thread is here:
> http://thread.gmane.org/gmane.comp.apache.db.derby.devel/33170/focus=33170
>
> As indicated in that thread, one simple (but not fully tested) approach is to
> make a change in the "serializeToString()" method of SqlXmlUtil.java to do an
> explicit replacement of platform-specific line-endings with a simple newline.
> Something like:
>
> +        String eol = PropertyUtil.getSystemProperty("line.separator");
> +        if (eol != null)
> +            return sWriter.toString().replaceAll(eol, "\n");
>          return sWriter.toString();
>
> This small change seems to provide consistent results across all platforms,
> and appears to work correctly even if line-endings are hard-coded in the XML
> file (ex. if the literal "\r\n" occurs in the XML file, the above code will
> *not* replace it, which is good). However, internal modification of
> user-supplied data is generally a risky proposal, so more testing would be
> needed for this particular approach.
>
> Also, any changes to Derby serialization as a part of this issue would need
> to consider backward-compatibility issues--namely, how would the changes
> affect XML files that have already been inserted into the database (and
> therefore that already have platform-specific endings)? Ideally treatment of
> existing and new XML data would be consistent.
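
For anyone who wants to experiment with the replacement idea quoted above
outside of Derby, here is a minimal standalone sketch. It is not a patch to
SqlXmlUtil.java: the class name EolNormalizer and its methods are hypothetical,
and System.getProperty stands in for Derby's PropertyUtil.getSystemProperty.
It also uses String.replace, which matches the separator literally, where the
quoted snippet uses replaceAll, which treats its first argument as a regular
expression.

```java
// Hypothetical helper for illustration only; not part of Derby.
// Replaces the platform-specific line separator in an already-serialized
// XML string with a plain "\n".
final class EolNormalizer {

    private EolNormalizer() {
        // static utility, no instances
    }

    // The separator is a parameter so that behaviour for any platform
    // (for example "\r\n") can be checked on whatever machine runs this.
    static String normalize(String serialized, String eol) {
        // Nothing to do if the separator is unknown, empty, or already "\n".
        if (eol == null || eol.isEmpty() || eol.equals("\n")) {
            return serialized;
        }
        // String.replace does a literal (non-regex) replacement of every
        // occurrence of eol.
        return serialized.replace(eol, "\n");
    }

    // Convenience overload that uses the separator of the current JVM.
    static String normalize(String serialized) {
        return normalize(serialized, System.getProperty("line.separator"));
    }
}
```

Passing the separator explicitly makes it possible to try the Windows case
with EolNormalizer.normalize(text, "\r\n") on Linux, and vice versa. Like the
quoted snippet, this would need more testing before being trusted with
user-supplied data.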
