Hi Chap,

Thanks for the thorough explanation!

On 20.01.25 20:09, Chapman Flack wrote:
>> PostgreSQL does not support the RETURNING SEQUENCE or RETURNING CONTENT
>> clauses explicitly. Instead, it implicitly uses RETURNING CONTENT[2] in
>> functions that require it. Since RETURNING CONTENT implies that the
>> output is a well-formed XML document (e.g., single-rooted),
> In fact, you can't infer single-root-element-ness from RETURNING CONTENT,
> according to the standard. Single-root-element-ness is checked by the
> IS DOCUMENT predicate, and by XMLPARSE and XMLSERIALIZE when they specify
> DOCUMENT. But it isn't checked or implied by the XMLDOCUMENT constructor.
>
> That amounts to a bit of unfortunate punning on the word DOCUMENT,
> but so help me that's what's in the standard.

Yeah, the term DOCUMENT seems a bit misleading in this context.

>
> It may help to think in terms of the hierarchy of XML types that the
> 2006 standard introduced (cribbed here from [3]):
>
>                               SEQUENCE
>                                  |
>                  (?sequence of length 1, a document node)
>                                  |
>           CONTENT(ANY)----------------.----------------(?every element
>                 |                     |                  conforms to a
>     (?every element has        (?no extraneous           schema)
>      xdt:untyped and !nilled,   nodes)                      |
>      every attribute has              |                     |
>      xdt:untypedAtomic)          DOCUMENT(ANY)      CONTENT(XMLSCHEMA)
>                 |                                           |
>         CONTENT(UNTYPED)                       (?whole thing is valid
>                 |                               according to schema)
>      (?no extraneous nodes)                                 |
>                 |                                  DOCUMENT(XMLSCHEMA)
>         DOCUMENT(UNTYPED)
>
> where the condition (?no extraneous nodes) is shorthand for SQL/XML's
> more precise "whose `children` property has exactly one XQuery element
> node, zero or more XQuery comment nodes, and zero or more XQuery
> processing instruction nodes".
>
> So that (?no extraneous nodes) condition is required for any of
> the XML(DOCUMENT...) types. When you relax that condition, you have
> an XML(CONTENT...) type.
>
> The XMLDOCUMENT constructor is so named because it constructs what
> corresponds to an XQuery document node, which actually corresponds to
> the XML(CONTENT...) SQL/XML types, and does not enforce having a
> single root element:
>
> "This data model is more permissive: a Document Node may be empty,
> it may have more than one Element Node as a child, and it also
> permits Text Nodes as children."[4]

Thanks a lot for pointing that out! I guess it's clear now.

>
> So in terms of the SQL/XML type hierarchy, what you get back from
> XMLDOCUMENT ... RETURNING CONTENT will have one of the XML(CONTENT...)
> types (whether it's CONTENT(ANY) or CONTENT(UNTYPED) is left to the
> implementation).
>
> If you then want to know if it is single-rooted, you can apply the
> IS DOCUMENT predicate, or try to cast it to an XML(DOCUMENT...) type.
>
> (And if you use XMLDOCUMENT ... RETURNING SEQUENCE, then you get a
> value of type XML(SEQUENCE). The sequence has length 1, a document
> node, making it safely castable to XML(CONTENT(ANY)), but whether
> you can cast it to an XML(DOCUMENT...) type will depend on what
> children that document node has.)
>
> Long story short, an XMLDOCUMENT constructor that enforced having
> a single root element would be nonconformant.
>

If I understand correctly, the compliant approach would be to always
treat the input expression as CONTENT, so that no single-root check is
applied:

    PG_RETURN_XML_P(xmlparse((text *) data, XMLOPTION_CONTENT, true));

Is that right?

>
>> 1 - https://www.ibm.com/docs/en/db2/11.1?topic=constructors-document-node
>> 2 - https://www.postgresql.org/docs/17/xml-limits-conformance.html
> 3 - https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#SQL.2FXML:2003_contrasted_with_SQL.2FXML_since_2006
> 4 - https://www.w3.org/TR/2010/REC-xpath-datamodel-20101214/#DocumentNode
>

Best, Jim
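
P.S. To check my reading of the (?no extraneous nodes) condition quoted above, here is a minimal standalone sketch of it in C. It is not PostgreSQL or libxml2 code: the NodeKind enum and the children_form_document() function are hypothetical names made up for illustration, and the document node's children are assumed to be given as a flat array of node kinds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node kinds for the children of an XQuery document node. */
typedef enum
{
    NODE_ELEMENT,
    NODE_TEXT,
    NODE_COMMENT,
    NODE_PI                     /* processing instruction */
} NodeKind;

/*
 * Returns true if the children satisfy "(?no extraneous nodes)":
 * exactly one element node, any number of comment and processing
 * instruction nodes, and nothing else (in particular no text nodes).
 */
bool
children_form_document(const NodeKind *children, size_t n)
{
    size_t      elements = 0;

    assert(children != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        switch (children[i])
        {
            case NODE_ELEMENT:
                elements++;
                break;
            case NODE_COMMENT:
            case NODE_PI:
                break;          /* allowed alongside the root element */
            default:
                return false;   /* text node: only valid as CONTENT */
        }
    }

    return elements == 1;
}
```

Under this reading, a child list that fails the check (empty, two elements, or a text node next to the root) would still be a perfectly fine XML(CONTENT...) value, which is why XMLDOCUMENT must not reject it; only IS DOCUMENT or a cast to an XML(DOCUMENT...) type would.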