Army wrote:
Daniel John Debrunner wrote:
Aren't there two separate (but related) issues here?
1) What Derby stores on disk (the extra number of characters stored)
2) How Derby serializes XML values back to an application (the test
counting issue).
Serialization of XML values can be expensive, so in order to ensure that
we only do it once, Derby serializes the data once and writes the
serialized version of it to disk. Then when we fetch it back we just
read it from disk and return it to the app. So due to the way this is
implemented, this is actually just one issue. Maybe that implementation
decision needs to be revisited, though...
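As a rough illustration of that design (a sketch only, not Derby's actual code; the class, constructor and method names are made up, and I'm assuming the extra characters are the "\r" in Windows line endings):

```java
public class StoredXmlSketch {
    // Hypothetical stand-in for the on-disk form: the value is
    // serialized exactly once, when it is stored.
    private final String onDisk;

    StoredXmlSketch(String[] lines, String lineSeparator) {
        // The line separator in effect at insert time gets baked in.
        this.onDisk = String.join(lineSeparator, lines);
    }

    // Fetching returns the stored characters as-is; nothing is
    // re-serialized for the platform doing the read.
    String fetch() {
        return onDisk;
    }

    public static void main(String[] args) {
        String[] doc = {"<a>", "<b/>", "</a>"};
        StoredXmlSketch fromLinux = new StoredXmlSketch(doc, "\n");
        StoredXmlSketch fromWindows = new StoredXmlSketch(doc, "\r\n");
        System.out.println(fromLinux.fetch().length());   // 13
        System.out.println(fromWindows.fetch().length()); // 15
    }
}
```

The same logical value ends up two characters longer when it was stored with "\r\n", which is the kind of difference a character-counting test would trip over.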
It seems that on output (XMLSERIALIZE), generating the correct line
endings for the platform makes sense.
But there's no "generation" here because the data is being read directly
from disk and returned to the user.
I was thinking more generally in that an XML value may be generated and
thus never have been stored to disk. How it is stored on disk and how
the XML value is serialized using XMLSERIALIZE() are different
operations; it's just an implementation detail of Derby that they are
the same in some instances.
Would all these operations return the same exact characters to an
application if they represent the same logical value?
XMLSERIALIZE(colvalue originally on linux)
XMLSERIALIZE(colvalue originally on windows)
XMLSERIALIZE(generated XML value from other XML operators)
Would it surprise an application to receive different character values
for those expressions?
If they are different, does it matter since they are all valid
serializations under SQL/XML?
My gut feeling is that different character values would be confusing to
an application, but it probably depends what the application is doing
with them. Looking at them in notepad would be confusing. :-)
Thinking a little more, having XMLSERIALIZE() (within a given runtime)
be non-deterministic seems wrong.
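One way to get the same characters in all three cases might be to normalize line endings before returning the value. A minimal sketch, assuming "\n" as the canonical form (which is what XML 1.0 end-of-line handling hands to an application after parsing); normalizeLineEndings is a hypothetical helper, not an existing Derby method:

```java
public class LineEndingSketch {
    // Hypothetical helper: map "\r\n" and any lone "\r" to "\n" so the
    // result does not depend on where the value was first stored.
    static String normalizeLineEndings(String serialized) {
        return serialized.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        String fromLinux = "<a>\n<b/>\n</a>";
        String fromWindows = "<a>\r\n<b/>\r\n</a>";

        // The raw stored forms differ character for character.
        System.out.println(fromLinux.equals(fromWindows)); // false

        // After normalization they are identical.
        System.out.println(normalizeLineEndings(fromLinux)
                .equals(normalizeLineEndings(fromWindows))); // true
    }
}
```

Whether the canonical form should be "\n" or the platform's line separator is a separate question; the point is only that one form is picked consistently within a runtime.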
Dan.