On Dec23, 2013, at 03:45 , Robert Haas <robertmh...@gmail.com> wrote:
> On Fri, Dec 20, 2013 at 8:16 PM, Florian Pflug <f...@phlo.org> wrote:
>> On Dec20, 2013, at 18:52 , Robert Haas <robertmh...@gmail.com> wrote:
>>> On Thu, Dec 19, 2013 at 6:40 PM, Florian Pflug <f...@phlo.org> wrote:
>>>> Solving this seems a bit messy, unfortunately. First, I think we need to 
>>>> have some XMLOPTION value which is a superset of all the others - 
>>>> otherwise, dump & restore won't work reliably. That means either allowing 
>>>> DTDs if XMLOPTION is CONTENT, or inventing a third XMLOPTION, say ANY.
>>> 
>>> Or we can just decide that it was a bug that this was ever allowed,
>>> and if you upgrade to $FIXEDVERSION you'll need to sanitize your data.
>>> This is roughly what we did with encoding checks.
>> 
>> What exactly do you suggest we outlaw?
> 
> <!DOCTYPE> anywhere but at the beginning.

I think we're talking past one another here. Fixing XMLCONCAT/XMLAGG
to not produce XML values which are neither valid DOCUMENTS nor valid
CONTENT fixes *one* part of the problem.

The other part of the problem is that since not every DOCUMENT
is valid CONTENT (because CONTENT forbids DTDs) and not every CONTENT
is a valid DOCUMENT (because DOCUMENT forbids multiple root nodes), it's
impossible to set XMLOPTION to a value which accepts *all* valid XML
values. That breaks pg_dump/pg_restore. To fix this, we must provide
a way to insert XML data which accepts both DOCUMENTS and CONTENT, and
not only one or the other. Due to the way COPY works, we cannot call
a special conversion function, so we must modify the input functions.

My initial thought was to simply allow XML values which are CONTENT,
not DOCUMENTS, to contain a DTD (at the beginning), thus making CONTENT
a superset of DOCUMENT. But I've since realized that the 2003
standard explicitly constrains CONTENT to *not* contain a DTD. The
only other option that I can see is to invent a third, non-standard
XMLOPTION value, ANY. ANY would accept anything accepted by either
DOCUMENT or CONTENT, but no more than that.
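
To make the acceptance rules concrete, here is a rough sketch in Python.
It is only an illustration, not how the backend does it: the function
names are made up, Python's xml.etree stands in for the real input
function, and wrapping the value in a dummy root element is just an
assumed shortcut for "any number of top-level nodes, but no DTD".

```python
import xml.etree.ElementTree as ET


def is_document(value: str) -> bool:
    """DOCUMENT: an optional DTD at the start, then exactly one root element."""
    try:
        ET.fromstring(value)
        return True
    except ET.ParseError:
        return False


def is_content(value: str) -> bool:
    """CONTENT: any number of top-level nodes, but no DTD."""
    # Assumption: wrapping in a dummy root lets the parser accept several
    # top-level nodes, and a DOCTYPE is not legal inside an element, so it
    # is rejected automatically.
    try:
        ET.fromstring("<dummy>" + value + "</dummy>")
        return True
    except ET.ParseError:
        return False


def is_any(value: str) -> bool:
    """Hypothetical ANY: whatever DOCUMENT or CONTENT accepts, nothing more."""
    return is_document(value) or is_content(value)


doc_with_dtd = "<!DOCTYPE a><a/>"      # DOCUMENT only
two_roots = "<a/><b/>"                 # CONTENT only
dtd_in_middle = "<a/><!DOCTYPE b><b/>" # neither

for v in (doc_with_dtd, two_roots, dtd_in_middle):
    print(v, is_document(v), is_content(v), is_any(v))
```

The first two values show why neither existing setting can restore every
dumped value; the third is the kind of value that ANY would still reject.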

best regards,
Florian Pflug

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
