2017-03-12 0:56 GMT+01:00 Noah Misch <n...@leadboat.com>:

> On Mon, Feb 20, 2017 at 07:48:18PM +0100, Pavel Stehule wrote:
> > Today I played with xml_recv function and with xml processing functions.
> >
> > xml_recv function ensures correct encoding from document encoding to
> > server encoding. But the decl section holds original encoding info - that
> > should be obsolete after encoding. Sometimes we solve this issue by
> > removing decl section - see the xml_out function.
> >
> > Sometimes we don't do it - a lot of functions use direct conversion from
> > xmltype to xmlChar.
>
> > There are two possible fixes
> >
> > a) clean decl on input - the encoding info can be removed from decl part
> >
> > b) use xml_out_internal everywhere before transformation to
> > xmlChar. pg_xmlCharStrndup can be good candidate.
>
> I'd prefer (a) if the xml type were a new feature, because no good can come
> of storing an encoding in each xml field when we know the actual encoding is
> the database encoding.  However, if you implemented (a), we'd still see
> untreated values brought over via pg_upgrade.  Therefore, I would try (b)
> first.  I suspect the intent of xml_parse() was to implement (b); it will be
> interesting to see your test case that malfunctions.
>

I looked at this again and found that the issue affects only the xpath
function.

Functions based on xml_parse work without problems. xpath_internal calls
xmlCtxtReadMemory directly on the stored value, without first sanitizing the
encoding declaration.

So the fix is pretty simple:

diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index f81cf489d2..89aae48cb3 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -3874,9 +3874,11 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
        ns_count = 0;
    }

-   datastr = VARDATA(data);
-   len = VARSIZE(data) - VARHDRSZ;
+   datastr = xml_out_internal(data, 0);
+   len = strlen(datastr);
+
    xpath_len = VARSIZE(xpath_expr_text) - VARHDRSZ;
+
    if (xpath_len == 0)
        ereport(ERROR,
                (errcode(ERRCODE_DATA_EXCEPTION),
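
For illustration only, here is a minimal standalone sketch of the effect the
patch relies on: the stale encoding is dropped from the XML declaration before
the document reaches the parser. This is not the PostgreSQL code and not how
xml_out_internal is implemented; strip_decl_encoding, find_encoding_span and
is_xml_space are hypothetical names made up for this example, and the sketch
assumes a NUL-terminated document whose declaration, if any, is at the very
start.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Whitespace as allowed inside an XML declaration. */
static int is_xml_space(char c)
{
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

/*
 * Hypothetical helper: locate the encoding pseudo-attribute (plus the
 * whitespace in front of it) inside a leading "<?xml ... ?>" declaration.
 * Returns 1 and sets [*from, *to) on success, 0 if there is nothing to cut.
 */
static int find_encoding_span(const char *doc, const char **from, const char **to)
{
    const char *decl_end, *enc, *p;
    char quote;

    if (strncmp(doc, "<?xml", 5) != 0 || !is_xml_space(doc[5]))
        return 0;                       /* no XML declaration */

    decl_end = strstr(doc, "?>");
    enc = strstr(doc + 5, "encoding");
    if (decl_end == NULL || enc == NULL || enc >= decl_end)
        return 0;                       /* no encoding inside the declaration */

    /* Expect: encoding, optional space, '=', optional space, quoted value. */
    p = enc + strlen("encoding");
    while (is_xml_space(*p))
        p++;
    if (*p != '=')
        return 0;
    p++;
    while (is_xml_space(*p))
        p++;
    quote = *p;
    if (quote != '"' && quote != '\'')
        return 0;
    p = strchr(p + 1, quote);
    if (p == NULL || p >= decl_end)
        return 0;

    /* Also swallow the whitespace that separated it from the previous item. */
    while (enc > doc + 5 && is_xml_space(enc[-1]))
        enc--;

    *from = enc;
    *to = p + 1;
    return 1;
}

/*
 * Hypothetical helper: copy doc into out without the declared encoding.
 * Returns 0 on success, -1 if out (outsz bytes) is too small.
 */
int strip_decl_encoding(const char *doc, char *out, size_t outsz)
{
    const char *from, *to;
    size_t head, tail;

    assert(doc != NULL && out != NULL);

    if (!find_encoding_span(doc, &from, &to))
        from = to = doc + strlen(doc);  /* nothing to cut: copy everything */

    head = (size_t) (from - doc);
    tail = strlen(to);
    if (head + tail + 1 > outsz)
        return -1;

    memcpy(out, doc, head);
    memcpy(out + head, to, tail + 1);   /* includes the terminating NUL */
    return 0;
}
```

With this sketch, a document starting with a declaration that names
ISO-8859-2 comes out with only the version left in the declaration, so a
parser no longer sees an encoding that contradicts the bytes it is given.
Documents without a declaration are copied unchanged.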

Regards

Pavel
