Army wrote:
Daniel John Debrunner wrote:

Aren't there two separate (but related) issues here?

1) What Derby stores on disk (the extra number of characters stored)
2) How Derby serializes XML values back to an application (the test counting issue).

Serialization of XML values can be expensive, so in order to ensure that we only do it once, Derby serializes the data once and writes the serialized version of it to disk. Then when we fetch it back we just read it from disk and return it to the app. So due to the way this is implemented, this is actually just one issue. Maybe that implementation decision needs to be revisited, though...

It seems that on output (XMLSERIALIZE), generating the correct line endings for the platform makes sense.

But there's no "generation" here because the data is being read directly from disk and returned to the user.

I was thinking more generally, in that an XML value may be generated and thus never have been stored to disk. How it is stored on disk and how the XML value is serialized using XMLSERIALIZE() are different operations; it's just an implementation detail of Derby that they are the same in some instances.

Would all these operations return the same exact characters to an application if they represent the same logical value?

XMLSERIALIZE(colvalue originally on linux)
XMLSERIALIZE(colvalue originally on windows)
XMLSERIALIZE(generated XML value from other XML operators)

Would it surprise an application to receive different character values for those expressions?

If they are different, does it matter since they are all valid serializations under SQL/XML?

My gut feeling is that different character values would be confusing to an application, but it probably depends what the application is doing with them. Looking at them in notepad would be confusing. :-)
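
To make the question concrete, here is a minimal Java sketch of what an application would see if two serializations of the same logical value differed only in line endings. This is not Derby code: the class name, the helper normalizeLineEndings() and the sample documents are all made up for illustration. The helper maps CRLF and lone CR to LF, which is the end-of-line handling XML 1.0 (section 2.11) requires of a parser on input.

```java
final class LineEndingDemo {

    // Hypothetical helper, not part of Derby: map CRLF and lone CR to LF,
    // mirroring the end-of-line handling of XML 1.0, section 2.11.
    static String normalizeLineEndings(String serialized) {
        return serialized.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        // Assumed sample values: the same logical document, as it might come
        // back if it had been stored with LF or with CRLF line endings.
        String storedWithLf = "<doc>\n  <item/>\n</doc>";
        String storedWithCrLf = "<doc>\r\n  <item/>\r\n</doc>";

        // A plain string comparison treats the two as different values.
        System.out.println(storedWithLf.equals(storedWithCrLf));

        // After normalization they compare equal.
        System.out.println(normalizeLineEndings(storedWithLf)
                .equals(normalizeLineEndings(storedWithCrLf)));
    }
}
```

An application that re-parses the result would not notice the difference, but one that compares or counts characters directly would.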

Thinking a little more, having XMLSERIALIZE() (within a given runtime) be non-deterministic seems wrong.

Dan.

