Re: [HACKERS] Bug in XPATH() produces invalid XML values and probably un-restorable dumps

2011-06-09 Thread Florian Pflug
On May31, 2011, at 20:50 , Florian Pflug wrote:
> While trying to figure out sensible semantics for XPATH() and scalar-value 
> returning XPath expressions, I've stumbled upon a bug in XPATH() that allows 
> invalid XML values to be produced. This is a serious problem because should 
> such invalid values get inserted into an XML column, an un-restorable dump 
> ensues.
> 
> Here's an example (REL9_0_STABLE as of a few days ago)
> 
> template1=# SELECT (XPATH('/*/text()', '<'))[1];
> xpath 
> ---
> <
> 
> ...
> 
> Patch is attached.

I've added two tests to the xml regression test which highlight the issue.

Updated patch attached.

best regards,
Florian Pflug


pg_xpath_invalidxml.v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Bug in XPATH() produces invalid XML values and probably un-restorable dumps

2011-05-31 Thread Florian Pflug
Hi

While trying to figure out sensible semantics for XPATH() and scalar-value 
returning XPath expressions, I've stumbled upon a bug in XPATH() that allows 
invalid XML values to be produced. This is a serious problem because should 
such invalid values get inserted into an XML column, an un-restorable dump 
ensues.

Here's an example (REL9_0_STABLE as of a few days ago):

template1=# SELECT (XPATH('/*/text()', '<'))[1];
 xpath 
---
 <

Since XPATH() returns XML[], this value has type XML, but clearly isn't 
well-formed. And behold, casting to TEXT and back to XML complains loudly.

template1=# SELECT (XPATH('/*/text()', '<'))[1]::TEXT::XML;
ERROR:  invalid XML content
DETAIL:  Entity: line 1: parser error : StartTag: invalid element name
<
 ^

The culprit is xml_xmlnodetoxmltype() in backend/utils/adt/xml.c. For 
non-element nodes, it returns the result of xmlXPathCastNodeToString() 
verbatim, even though that function doesn't reverse the entity replacement 
that was done during parsing, so characters such as '<' come back unescaped. 
Adding a call to escape_xml() for non-element nodes fixes the problem:

template1=# SELECT (XPATH('/*/text()', '<'))[1];
 xpath 
---
 &lt;
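
To illustrate the idea outside the backend, here is a minimal standalone 
sketch in C. It is not the attached patch and does not use libxml2 or 
PostgreSQL's escape_xml(); the names escape_text_sketch(), 
node_to_xml_sketch() and the node_kind_sketch enum are made up for this 
example, and the set of escaped characters (&, <, >) is an assumption. It only 
shows the rule described above: element nodes are passed through as 
serialized markup, while the string value of any other node is re-escaped 
before being treated as XML.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node kinds; the real code inspects libxml2's node type. */
typedef enum { NODE_ELEMENT_SKETCH, NODE_OTHER_SKETCH } node_kind_sketch;

/*
 * Hypothetical stand-in for an XML escaping helper.
 * Returns a newly allocated string in which &, < and > are replaced by
 * their predefined entities. The caller must free() the result.
 */
char *escape_text_sketch(const char *in)
{
    size_t len;
    char *out;
    char *p;

    assert(in != NULL);
    len = strlen(in);
    /* Worst case: every input byte expands to "&amp;" (5 bytes). */
    out = malloc(len * 5 + 1);
    if (out == NULL)
        return NULL;

    p = out;
    for (; *in != '\0'; in++) {
        switch (*in) {
        case '&': memcpy(p, "&amp;", 5); p += 5; break;
        case '<': memcpy(p, "&lt;", 4);  p += 4; break;
        case '>': memcpy(p, "&gt;", 4);  p += 4; break;
        default:  *p++ = *in;            break;
        }
    }
    *p = '\0';
    return out;
}

/*
 * Sketch of the decision made when turning an XPath result node into an
 * XML value: serialized element markup is copied verbatim, whereas the
 * (already entity-decoded) string value of any other node is escaped.
 */
char *node_to_xml_sketch(node_kind_sketch kind, const char *content)
{
    assert(content != NULL);
    if (kind == NODE_ELEMENT_SKETCH) {
        char *copy = malloc(strlen(content) + 1);
        if (copy != NULL)
            strcpy(copy, content);
        return copy;
    }
    return escape_text_sketch(content);
}
```

With this sketch, node_to_xml_sketch(NODE_OTHER_SKETCH, "<") yields the 
well-formed content "&lt;", while an element's serialized markup is left 
untouched.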

Patch is attached.

best regards,
Florian Pflug


pg_xpath_invalidxml.v1.patch
Description: Binary data
