subject:"PostgreSQL vs SQL\/XML Standards"

Re: PostgreSQL vs SQL/XML Standards

2019-06-01 Thread Chapman Flack

On 05/31/19 16:04, Alvaro Herrera wrote:

> https://www.iso.org/standard/41528.html "XQuery Regular Expression
> Support in SQL"

Although I hadn't seen that particular document, I did see those in the
SQL spec and mention them in the wiki page [1].

I should point out that's also a conformance note in the new SQL/JSON
support [2], as the standard specifies XQuery regular expressions for
jsonpath's like_regex predicate, where we're applying POSIX ones
instead.

Doesn't change my thinking much ... I think we should support the
stuff. :) ... using a non-in-house-developed XQuery library to do it,
or better yet, a choice of non-in-house-developed XQuery libraries.

I still think the second idea I suggested on the wiki seems like the
most promising road map [3], though it'll involve a few preparatory
stops along the way.

-Chap

[1]
https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#XML_Query_regular_expressions

[2]
https://github.com/obartunov/sqljsondoc/blob/master/README.jsonpath.md#sqljson-conformance

[3]
https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Proposal_2:_desugaring_to_calls_on_normal_extension_functions

Re: PostgreSQL vs SQL/XML Standards

2019-05-31 Thread Alvaro Herrera

On 2018-Oct-24, Chapman Flack wrote:

> Inspired by the wiki page on PostgreSQL vs SQL Standard in general,
> I have made another wiki page specifically about $subject. I hope
> this was not presumptuous, and invite review / comment. I have not
> linked to it from any other page yet.

In the SQL Standards session at the Unconference, I found out that the
committee produced this technical report in 2011:
https://www.iso.org/standard/41528.html "XQuery Regular Expression
Support in SQL"; it furthers our lack of support for XQuery in our
implementation SQL/XML.  That content is probably relevant for this
topic, even if we cannot do much about it.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: PostgreSQL vs SQL/XML Standards

2019-03-14 Thread Alvaro Herrera

On 2019-Mar-14, Pavel Stehule wrote:

> I looking to code
> 
> void(*nodefree) (xmlNodePtr) = NULL;
> volatile xmlBufferPtr buf = NULL;
> 
> should not be "nodefree" volatile too?

Ah, good question.  I remember I had it volatile and removed it for some
reason, though I don't remember why.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: PostgreSQL vs SQL/XML Standards

2019-03-14 Thread Pavel Stehule

Hi

pá 8. 3. 2019 v 19:44 odesílatel Alvaro Herrera 
napsal:

> On 2019-Mar-08, Alvaro Herrera wrote:
>
> > > Maybe we can call explicitly xmlFreeDoc instead xmlFreeNode
> > >
> > > some like
> > >
> > > if (cur_copy->type == XML_DOCUMENT_NODE)
> > >   xmlFreeDoc((xmlDocPtr) cur_copy);
> > > else
> > >   xmlFreeNode(cur_copy);
> > >
> > > This looks most correct fix for me. What do you think?
> >
> > Seems like that should work, yeah ...
>
> Something like this perhaps?  Less repetitive ...
>

I looking to code

void(*nodefree) (xmlNodePtr) = NULL;
volatile xmlBufferPtr buf = NULL;

should not be "nodefree" volatile too?

Pavel


>
> --
> Álvaro Herrerahttps://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

Re: PostgreSQL vs SQL/XML Standards

2019-03-08 Thread Pavel Stehule

pá 8. 3. 2019 v 19:48 odesílatel Pavel Stehule 
napsal:

>
>
> pá 8. 3. 2019 v 19:44 odesílatel Alvaro Herrera 
> napsal:
>
>> On 2019-Mar-08, Alvaro Herrera wrote:
>>
>> > > Maybe we can call explicitly xmlFreeDoc instead xmlFreeNode
>> > >
>> > > some like
>> > >
>> > > if (cur_copy->type == XML_DOCUMENT_NODE)
>> > >   xmlFreeDoc((xmlDocPtr) cur_copy);
>> > > else
>> > >   xmlFreeNode(cur_copy);
>> > >
>> > > This looks most correct fix for me. What do you think?
>> >
>> > Seems like that should work, yeah ...
>>
>> Something like this perhaps?  Less repetitive ...
>>
>
Thank you for commit.

the commit message is not correct. xmlCopyNodes does owns works well, but
the node is broken already, and because we should to call xmlFreeNode, we
have a problem.

regards

Pavel


> +1
>
> Pavel
>
>
>> --
>> Álvaro Herrerahttps://www.2ndQuadrant.com/
>> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>
>

Re: PostgreSQL vs SQL/XML Standards

2019-03-08 Thread Pavel Stehule

pá 8. 3. 2019 v 19:44 odesílatel Alvaro Herrera 
napsal:

> On 2019-Mar-08, Alvaro Herrera wrote:
>
> > > Maybe we can call explicitly xmlFreeDoc instead xmlFreeNode
> > >
> > > some like
> > >
> > > if (cur_copy->type == XML_DOCUMENT_NODE)
> > >   xmlFreeDoc((xmlDocPtr) cur_copy);
> > > else
> > >   xmlFreeNode(cur_copy);
> > >
> > > This looks most correct fix for me. What do you think?
> >
> > Seems like that should work, yeah ...
>
> Something like this perhaps?  Less repetitive ...
>

+1

Pavel


> --
> Álvaro Herrerahttps://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

Re: PostgreSQL vs SQL/XML Standards

2019-03-08 Thread Alvaro Herrera

On 2019-Mar-08, Alvaro Herrera wrote:

> > Maybe we can call explicitly xmlFreeDoc instead xmlFreeNode
> > 
> > some like
> > 
> > if (cur_copy->type == XML_DOCUMENT_NODE)
> >   xmlFreeDoc((xmlDocPtr) cur_copy);
> > else
> >   xmlFreeNode(cur_copy);
> > 
> > This looks most correct fix for me. What do you think?
> 
> Seems like that should work, yeah ...

Something like this perhaps?  Less repetitive ...

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 28b3eaaa201..9204d6e9cf6 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -3720,35 +3720,57 @@ xml_xmlnodetoxmltype(xmlNodePtr cur, PgXmlErrorContext *xmlerrcxt)
 
 	if (cur->type != XML_ATTRIBUTE_NODE && cur->type != XML_TEXT_NODE)
 	{
-		xmlBufferPtr buf;
-		xmlNodePtr	cur_copy;
-
-		buf = xmlBufferCreate();
-
-		/*
-		 * The result of xmlNodeDump() won't contain namespace definitions
-		 * from parent nodes, but xmlCopyNode() duplicates a node along with
-		 * its required namespace definitions.
-		 */
-		cur_copy = xmlCopyNode(cur, 1);
-
-		if (cur_copy == NULL)
-			xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
-		"could not copy node");
+		volatile void (*nodefree) (xmlNodePtr) = NULL;
+		volatile xmlBufferPtr buf = NULL;
+		volatile xmlNodePtr cur_copy = NULL;
 
 		PG_TRY();
 		{
-			xmlNodeDump(buf, NULL, cur_copy, 0, 1);
+			int			bytes;
+
+			buf = xmlBufferCreate();
+			if (buf == NULL || xmlerrcxt->err_occurred)
+xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
+			"could not allocate xmlBuffer");
+
+			/*
+			 * Produce a dump of the node that we can serialize.  xmlNodeDump
+			 * does that, but the result of that function won't contain
+			 * namespace definitions from ancestor nodes, so we first do a
+			 * xmlCopyNode() which duplicates the node along with its required
+			 * namespace definitions.
+			 *
+			 * Some old libxml2 versions such as 2.7.6 produce partially
+			 * broken XML_DOCUMENT_NODE nodes (unset content field) when
+			 * copying them.  xmlNodeDump of such a node works fine, but
+			 * xmlFreeNode crashes; set us up to call xmlFreeDoc instead.
+			 */
+			cur_copy = xmlCopyNode(cur, 1);
+			if (cur_copy == NULL || xmlerrcxt->err_occurred)
+xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
+			"could not copy node");
+			nodefree = cur_copy->type == XML_DOCUMENT_NODE ?
+(void (*) (xmlNodePtr)) xmlFreeDoc : xmlFreeNode;
+
+			bytes = xmlNodeDump(buf, NULL, cur_copy, 0, 1);
+			if (bytes == -1 || xmlerrcxt->err_occurred)
+xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
+			"could not dump node");
+
 			result = xmlBuffer_to_xmltype(buf);
 		}
 		PG_CATCH();
 		{
-			xmlFreeNode(cur_copy);
-			xmlBufferFree(buf);
+			if (nodefree)
+nodefree(cur_copy);
+			if (buf)
+xmlBufferFree(buf);
 			PG_RE_THROW();
 		}
 		PG_END_TRY();
-		xmlFreeNode(cur_copy);
+
+		if (nodefree)
+			nodefree(cur_copy);
 		xmlBufferFree(buf);
 	}
 	else

Re: PostgreSQL vs SQL/XML Standards

2019-03-08 Thread Alvaro Herrera

(I spent some time trying to reproduce the original bug, but was
interrupted for lunch before getting a useful installation.  I find it a
bit strange that it doesn't crash in x86_64, mind ...)

On 2019-Mar-08, Pavel Stehule wrote:

> It fixes current issue, but I afraid so these two routines are not
> replaceable. xmlFreeNodeList doesn't release xmlFreeDtd, XML_ATTRIBUTE_NODE
> is not checked.

:-(

> Maybe we can call explicitly xmlFreeDoc instead xmlFreeNode
> 
> some like
> 
> if (cur_copy->type == XML_DOCUMENT_NODE)
>   xmlFreeDoc((xmlDocPtr) cur_copy);
> else
>   xmlFreeNode(cur_copy);
> 
> This looks most correct fix for me. What do you think?

Seems like that should work, yeah ...

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: PostgreSQL vs SQL/XML Standards

2019-03-08 Thread Pavel Stehule

pá 8. 3. 2019 v 15:31 odesílatel Alvaro Herrera 
napsal:

> On 2019-Mar-08, Pavel Stehule wrote:
>
> > looks like error in xmlXPathCompiledEval function, that produce little
> bit
> > broken result for XML_DOCUMENT_NODE type. I hadn't this problem with
> > libxml2 2.7.6 64bit, but I seen this issue on same version on 32bit.
> >
> > Currently I had not fresh 32 bit system to check it.
> >
> > I found a workaround - in this case copy (and release xmlNode) is not
> > necessary.
>
> Hmm ... going over the libxml2 2.7.6 source, I noticed that
> xmlFreeNodeList seem to get this right -- it uses xmlFreeDoc for
> XML_DOCUMENT_NODE.  Maybe a sufficient answer is to change the
> xmlFreeNode there to xmlFreeNodeList.
>

It fixes current issue, but I afraid so these two routines are not
replaceable. xmlFreeNodeList doesn't release xmlFreeDtd, XML_ATTRIBUTE_NODE
is not checked.

You can see, from xmlNodeGetContent, XML_DOCUMENT_NODE type should to
ignore content value, and newer returns this value.  Other interesting is
xmlXPathOrderDocElems where content is used for counting, and probably from
there is -1.

Maybe we can call explicitly xmlFreeDoc instead xmlFreeNode

some like

if (cur_copy->type == XML_DOCUMENT_NODE)
  xmlFreeDoc((xmlDocPtr) cur_copy);
else
  xmlFreeNode(cur_copy);

This looks most correct fix for me. What do you think?

Pavel



> --
> Álvaro Herrerahttps://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

Re: PostgreSQL vs SQL/XML Standards

2019-03-08 Thread Alvaro Herrera

On 2019-Mar-08, Pavel Stehule wrote:

> looks like error in xmlXPathCompiledEval function, that produce little bit
> broken result for XML_DOCUMENT_NODE type. I hadn't this problem with
> libxml2 2.7.6 64bit, but I seen this issue on same version on 32bit.
> 
> Currently I had not fresh 32 bit system to check it.
> 
> I found a workaround - in this case copy (and release xmlNode) is not
> necessary.

Hmm ... going over the libxml2 2.7.6 source, I noticed that
xmlFreeNodeList seem to get this right -- it uses xmlFreeDoc for
XML_DOCUMENT_NODE.  Maybe a sufficient answer is to change the
xmlFreeNode there to xmlFreeNodeList.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: PostgreSQL vs SQL/XML Standards

2019-03-08 Thread Pavel Stehule

pá 8. 3. 2019 v 13:20 odesílatel Alvaro Herrera 
napsal:

> On 2019-Mar-08, Pavel Stehule wrote:
>
> > looks like error in xmlXPathCompiledEval function, that produce little
> bit
> > broken result for XML_DOCUMENT_NODE type. I hadn't this problem with
> > libxml2 2.7.6 64bit, but I seen this issue on same version on 32bit.
> >
> > Currently I had not fresh 32 bit system to check it.
> >
> > I found a workaround - in this case copy (and release xmlNode) is not
> > necessary.
> >
> > please, apply attached patch.
>
> Wow :-(  At this point I'm wondering if this should be put in back
> branches as well ... I mean, distill part of commit 251cf2e27bec that
> doesn't affect the behavior of text nodes, and put it on all branches
> together with your fix?
>

The problem is just for case result: XML_DOCUMENT_TYPE, target XML. For
this case the previously used transformation doesn't work.

Is not problem to detect this situation. The content field has -1 instead
0.

Originally there was inverted logic, so xmlCopyNode and xmlFreeNode was not
used for XML_DOCUMENT_TYPE, and then we didn't hit this bug.


> Another thought: should we refuse to work on known-broken libxml2
> versions?  Seems like this bug could affect other parts of code too -- I
> see that xmlXPathCompiledEval() is called in file places (including two
> in contrib/xml2).
>
> Third thought: an alternative might be to create a wrapper for
> xmlXPathCompiledEval that detects NULL content and fills in a pointer
> that xmlFreeNode can free.
>



>
> --
> Álvaro Herrerahttps://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

Re: PostgreSQL vs SQL/XML Standards

2019-03-08 Thread Alvaro Herrera

On 2019-Mar-08, Pavel Stehule wrote:

> looks like error in xmlXPathCompiledEval function, that produce little bit
> broken result for XML_DOCUMENT_NODE type. I hadn't this problem with
> libxml2 2.7.6 64bit, but I seen this issue on same version on 32bit.
> 
> Currently I had not fresh 32 bit system to check it.
> 
> I found a workaround - in this case copy (and release xmlNode) is not
> necessary.
> 
> please, apply attached patch.

Wow :-(  At this point I'm wondering if this should be put in back
branches as well ... I mean, distill part of commit 251cf2e27bec that
doesn't affect the behavior of text nodes, and put it on all branches
together with your fix?

Another thought: should we refuse to work on known-broken libxml2
versions?  Seems like this bug could affect other parts of code too -- I
see that xmlXPathCompiledEval() is called in file places (including two
in contrib/xml2).

Third thought: an alternative might be to create a wrapper for
xmlXPathCompiledEval that detects NULL content and fills in a pointer
that xmlFreeNode can free.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: PostgreSQL vs SQL/XML Standards

2019-03-08 Thread Pavel Stehule

pá 8. 3. 2019 v 7:41 odesílatel Pavel Stehule 
napsal:

> Hi
>
> pá 8. 3. 2019 v 3:44 odesílatel Alvaro Herrera 
> napsal:
>
>> On 2019-Mar-07, Alvaro Herrera wrote:
>>
>> > On 2019-Feb-11, Chapman Flack wrote:
>> >
>> > > xmltable-xpath-result-processing-bugfix-6.patch includes a
>> regress/expected
>> > > output for the no-libxml case that was left out of -5.
>> >
>> > Pushed this one, with some trivial changes: I renamed and relocated
>> > Pavel's function to strdup-and-free and fused the new test cases to use
>> > less queries for the same functionality.  Naturally I had to adjust the
>> > expected files ... I tried to do my best but there's always a little
>> > something that sneaks under one's nose.
>>
>> So we now have a double-free bug here or something ...  Too tired right
>> now to do anything about it.
>>
>>
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grison&dt=2019-03-08%2002%3A00%3A02
>>
>> == stack trace:
>> pgsql.build/src/test/regress/tmp_check/data/core ==
>> [New LWP 20275]
>>
>> warning: Can't read pathname for load map: Input/output error.
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library
>> "/lib/arm-linux-gnueabihf/libthread_db.so.1".
>> Core was generated by `postgres: pgbuildfarm regression [local] SELECT
>>'.
>> Program terminated with signal 11, Segmentation fault.
>> #0  0x76a32e04 in free () from /lib/arm-linux-gnueabihf/libc.so.6
>> #0  0x76a32e04 in free () from /lib/arm-linux-gnueabihf/libc.so.6
>> #1  0x76e28380 in xmlFreeNode () from
>> /usr/lib/arm-linux-gnueabihf/libxml2.so.2
>> #2  0x00481f94 in xml_xmlnodetoxmltype (cur=,
>> xmlerrcxt=) at xml.c:3751
>> #3  0x004823dc in XmlTableGetValue (state=0x148c370, colnum=1994818404,
>> typid=2124685192, typmod=0, isnull=0x7ea42117) at xml.c:4540
>> #4  0x0026df60 in tfuncLoadRows (econtext=0x2, tstate=0x14a8578) at
>> nodeTableFuncscan.c:489
>> #5  tfuncFetchRows (tstate=0x14a8578, econtext=0x2) at
>> nodeTableFuncscan.c:318
>> #6  0x0026e248 in TableFuncNext (node=0x14a83e0) at nodeTableFuncscan.c:65
>> #7  0x0023e640 in ExecScanFetch (recheckMtd=0x23e640 ,
>> accessMtd=0x26db1c , node=0x14a83e0) at execScan.c:93
>> #8  ExecScan (node=0x14a83e0, accessMtd=0x26db1c ,
>> recheckMtd=0x23e640 ) at execScan.c:143
>> #9  0x0023c638 in ExecProcNodeFirst (node=0x14a83e0) at execProcnode.c:445
>> #10 0x00235630 in ExecProcNode (node=0x14a83e0) at
>> ../../../src/include/executor/executor.h:241
>> #11 ExecutePlan (execute_once=, dest=0x14b8d08,
>> direction=, numberTuples=0, sendTuples=,
>> operation=CMD_SELECT, use_parallel_mode=,
>> planstate=0x14a83e0, estate=0x14a82a0) at execMain.c:1643
>> #12 standard_ExecutorRun (queryDesc=0x142c0e0, direction=,
>> count=0, execute_once=true) at execMain.c:362
>> #13 0x003955f4 in PortalRunSelect (portal=0x13e3f48, forward=> out>, count=0, dest=) at pquery.c:929
>> #14 0x00396a0c in PortalRun (portal=0x0, count=0, isTopLevel=> out>, run_once=, dest=0x14b8d08, altdest=0x14b8d08,
>> completionTag=0x7ea42414 "") at pquery.c:770
>> #15 0x003923c8 in exec_simple_query (query_string=0x7ea42414 "") at
>> postgres.c:1215
>> #16 0x00393e8c in PostgresMain (argc=, argv=> out>, dbname=0x0, username=) at postgres.c:4256
>> #17 0x000849a8 in BackendRun (port=0x13967b8) at postmaster.c:4399
>> #18 BackendStartup (port=0x13967b8) at postmaster.c:4090
>> #19 ServerLoop () at postmaster.c:1703
>> #20 0x0031607c in PostmasterMain (argc=, argv=> out>) at postmaster.c:1376
>> #21 0x000864d4 in main (argc=7301080, argv=0x8) at main.c:228
>>
>
>
looks like error in xmlXPathCompiledEval function, that produce little bit
broken result for XML_DOCUMENT_NODE type. I hadn't this problem with
libxml2 2.7.6 64bit, but I seen this issue on same version on 32bit.

Currently I had not fresh 32 bit system to check it.

I found a workaround - in this case copy (and release xmlNode) is not
necessary.

please, apply attached patch.


> Pavel
>
>
>>
>> --
>> Álvaro Herrerahttps://www.2ndQuadrant.com/
>> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>
>
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 28b3eaaa20..41145e697a 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -3720,35 +3720,58 @@ xml_xmlnodetoxmltype(xmlNodePtr cur, PgXmlErrorContext *xmlerrcxt)
 
 	if (cur->type != XML_ATTRIBUTE_NODE && cur->type != XML_TEXT_NODE)
 	{
-		xmlBufferPtr buf;
-		xmlNodePtr	cur_copy;
+		volatile xmlBufferPtr buf = NULL;
+		volatile xmlNodePtr	cur_copy = NULL;
 
-		buf = xmlBufferCreate();
+		PG_TRY();
+		{
+			int		bytes;
 
-		/*
-		 * The result of xmlNodeDump() won't contain namespace definitions
-		 * from parent nodes, but xmlCopyNode() duplicates a node along with
-		 * its required namespace definitions.
-		 */
-		cur_copy = xmlCopyNode(cur, 1);
+			buf = xmlBufferCreate();
 
-		if (cur_copy == NULL)
-			xml_ereport(xmlerrcxt, ERROR,

Re: PostgreSQL vs SQL/XML Standards

2019-03-08 Thread Pavel Stehule

pá 8. 3. 2019 v 3:44 odesílatel Alvaro Herrera 
napsal:

> On 2019-Mar-07, Alvaro Herrera wrote:
>
> > On 2019-Feb-11, Chapman Flack wrote:
> >
> > > xmltable-xpath-result-processing-bugfix-6.patch includes a
> regress/expected
> > > output for the no-libxml case that was left out of -5.
> >
> > Pushed this one, with some trivial changes: I renamed and relocated
> > Pavel's function to strdup-and-free and fused the new test cases to use
> > less queries for the same functionality.  Naturally I had to adjust the
> > expected files ... I tried to do my best but there's always a little
> > something that sneaks under one's nose.
>
> So we now have a double-free bug here or something ...  Too tired right
> now to do anything about it.
>
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grison&dt=2019-03-08%2002%3A00%3A02
>
> == stack trace:
> pgsql.build/src/test/regress/tmp_check/data/core ==
> [New LWP 20275]
>
> warning: Can't read pathname for load map: Input/output error.
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library
> "/lib/arm-linux-gnueabihf/libthread_db.so.1".
> Core was generated by `postgres: pgbuildfarm regression [local] SELECT
>'.
> Program terminated with signal 11, Segmentation fault.
> #0  0x76a32e04 in free () from /lib/arm-linux-gnueabihf/libc.so.6
> #0  0x76a32e04 in free () from /lib/arm-linux-gnueabihf/libc.so.6
> #1  0x76e28380 in xmlFreeNode () from
> /usr/lib/arm-linux-gnueabihf/libxml2.so.2
> #2  0x00481f94 in xml_xmlnodetoxmltype (cur=,
> xmlerrcxt=) at xml.c:3751
> #3  0x004823dc in XmlTableGetValue (state=0x148c370, colnum=1994818404,
> typid=2124685192, typmod=0, isnull=0x7ea42117) at xml.c:4540
> #4  0x0026df60 in tfuncLoadRows (econtext=0x2, tstate=0x14a8578) at
> nodeTableFuncscan.c:489
> #5  tfuncFetchRows (tstate=0x14a8578, econtext=0x2) at
> nodeTableFuncscan.c:318
> #6  0x0026e248 in TableFuncNext (node=0x14a83e0) at nodeTableFuncscan.c:65
> #7  0x0023e640 in ExecScanFetch (recheckMtd=0x23e640 ,
> accessMtd=0x26db1c , node=0x14a83e0) at execScan.c:93
> #8  ExecScan (node=0x14a83e0, accessMtd=0x26db1c ,
> recheckMtd=0x23e640 ) at execScan.c:143
> #9  0x0023c638 in ExecProcNodeFirst (node=0x14a83e0) at execProcnode.c:445
> #10 0x00235630 in ExecProcNode (node=0x14a83e0) at
> ../../../src/include/executor/executor.h:241
> #11 ExecutePlan (execute_once=, dest=0x14b8d08,
> direction=, numberTuples=0, sendTuples=,
> operation=CMD_SELECT, use_parallel_mode=,
> planstate=0x14a83e0, estate=0x14a82a0) at execMain.c:1643
> #12 standard_ExecutorRun (queryDesc=0x142c0e0, direction=,
> count=0, execute_once=true) at execMain.c:362
> #13 0x003955f4 in PortalRunSelect (portal=0x13e3f48, forward= out>, count=0, dest=) at pquery.c:929
> #14 0x00396a0c in PortalRun (portal=0x0, count=0, isTopLevel= out>, run_once=, dest=0x14b8d08, altdest=0x14b8d08,
> completionTag=0x7ea42414 "") at pquery.c:770
> #15 0x003923c8 in exec_simple_query (query_string=0x7ea42414 "") at
> postgres.c:1215
> #16 0x00393e8c in PostgresMain (argc=, argv= out>, dbname=0x0, username=) at postgres.c:4256
> #17 0x000849a8 in BackendRun (port=0x13967b8) at postmaster.c:4399
> #18 BackendStartup (port=0x13967b8) at postmaster.c:4090
> #19 ServerLoop () at postmaster.c:1703
> #20 0x0031607c in PostmasterMain (argc=, argv= out>) at postmaster.c:1376
> #21 0x000864d4 in main (argc=7301080, argv=0x8) at main.c:228
>

I am able to emulate it on 32bit old scientific linux. So I hope I find fix

>
> --
> Álvaro Herrerahttps://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

Re: PostgreSQL vs SQL/XML Standards

2019-03-07 Thread Pavel Stehule

Hi

pá 8. 3. 2019 v 3:44 odesílatel Alvaro Herrera 
napsal:

> On 2019-Mar-07, Alvaro Herrera wrote:
>
> > On 2019-Feb-11, Chapman Flack wrote:
> >
> > > xmltable-xpath-result-processing-bugfix-6.patch includes a
> regress/expected
> > > output for the no-libxml case that was left out of -5.
> >
> > Pushed this one, with some trivial changes: I renamed and relocated
> > Pavel's function to strdup-and-free and fused the new test cases to use
> > less queries for the same functionality.  Naturally I had to adjust the
> > expected files ... I tried to do my best but there's always a little
> > something that sneaks under one's nose.
>
> So we now have a double-free bug here or something ...  Too tired right
> now to do anything about it.
>
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grison&dt=2019-03-08%2002%3A00%3A02
>
> == stack trace:
> pgsql.build/src/test/regress/tmp_check/data/core ==
> [New LWP 20275]
>
> warning: Can't read pathname for load map: Input/output error.
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library
> "/lib/arm-linux-gnueabihf/libthread_db.so.1".
> Core was generated by `postgres: pgbuildfarm regression [local] SELECT
>'.
> Program terminated with signal 11, Segmentation fault.
> #0  0x76a32e04 in free () from /lib/arm-linux-gnueabihf/libc.so.6
> #0  0x76a32e04 in free () from /lib/arm-linux-gnueabihf/libc.so.6
> #1  0x76e28380 in xmlFreeNode () from
> /usr/lib/arm-linux-gnueabihf/libxml2.so.2
> #2  0x00481f94 in xml_xmlnodetoxmltype (cur=,
> xmlerrcxt=) at xml.c:3751
> #3  0x004823dc in XmlTableGetValue (state=0x148c370, colnum=1994818404,
> typid=2124685192, typmod=0, isnull=0x7ea42117) at xml.c:4540
> #4  0x0026df60 in tfuncLoadRows (econtext=0x2, tstate=0x14a8578) at
> nodeTableFuncscan.c:489
> #5  tfuncFetchRows (tstate=0x14a8578, econtext=0x2) at
> nodeTableFuncscan.c:318
> #6  0x0026e248 in TableFuncNext (node=0x14a83e0) at nodeTableFuncscan.c:65
> #7  0x0023e640 in ExecScanFetch (recheckMtd=0x23e640 ,
> accessMtd=0x26db1c , node=0x14a83e0) at execScan.c:93
> #8  ExecScan (node=0x14a83e0, accessMtd=0x26db1c ,
> recheckMtd=0x23e640 ) at execScan.c:143
> #9  0x0023c638 in ExecProcNodeFirst (node=0x14a83e0) at execProcnode.c:445
> #10 0x00235630 in ExecProcNode (node=0x14a83e0) at
> ../../../src/include/executor/executor.h:241
> #11 ExecutePlan (execute_once=, dest=0x14b8d08,
> direction=, numberTuples=0, sendTuples=,
> operation=CMD_SELECT, use_parallel_mode=,
> planstate=0x14a83e0, estate=0x14a82a0) at execMain.c:1643
> #12 standard_ExecutorRun (queryDesc=0x142c0e0, direction=,
> count=0, execute_once=true) at execMain.c:362
> #13 0x003955f4 in PortalRunSelect (portal=0x13e3f48, forward= out>, count=0, dest=) at pquery.c:929
> #14 0x00396a0c in PortalRun (portal=0x0, count=0, isTopLevel= out>, run_once=, dest=0x14b8d08, altdest=0x14b8d08,
> completionTag=0x7ea42414 "") at pquery.c:770
> #15 0x003923c8 in exec_simple_query (query_string=0x7ea42414 "") at
> postgres.c:1215
> #16 0x00393e8c in PostgresMain (argc=, argv= out>, dbname=0x0, username=) at postgres.c:4256
> #17 0x000849a8 in BackendRun (port=0x13967b8) at postmaster.c:4399
> #18 BackendStartup (port=0x13967b8) at postmaster.c:4090
> #19 ServerLoop () at postmaster.c:1703
> #20 0x0031607c in PostmasterMain (argc=, argv= out>) at postmaster.c:1376
> #21 0x000864d4 in main (argc=7301080, argv=0x8) at main.c:228
>

I have not any hypotheses what is reason - maybe we hit some libxml2 error
in xmlCopyNode function - now it is called for larger set of node types.

What I see, a error handling in original code was not probably correct.
Hard to say if attached patch fix it, but probably it is more correct than
now.

Regards

Pavel


>
> --
> Álvaro Herrerahttps://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 28b3eaaa20..173784da4b 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -3720,31 +3720,42 @@ xml_xmlnodetoxmltype(xmlNodePtr cur, PgXmlErrorContext *xmlerrcxt)
 
 	if (cur->type != XML_ATTRIBUTE_NODE && cur->type != XML_TEXT_NODE)
 	{
-		xmlBufferPtr buf;
-		xmlNodePtr	cur_copy;
+		volatile xmlBufferPtr buf = NULL;
+		volatile xmlNodePtr	cur_copy = NULL;
 
-		buf = xmlBufferCreate();
+		PG_TRY();
+		{
+			int		bytes;
 
-		/*
-		 * The result of xmlNodeDump() won't contain namespace definitions
-		 * from parent nodes, but xmlCopyNode() duplicates a node along with
-		 * its required namespace definitions.
-		 */
-		cur_copy = xmlCopyNode(cur, 1);
+			buf = xmlBufferCreate();
 
-		if (cur_copy == NULL)
-			xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
-		"could not copy node");
+			if (buf == NULL || xmlerrcxt->err_occurred)
+xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
+			"could not allocate xmlBuffer");
+
+			/*
+			 * The

Re: PostgreSQL vs SQL/XML Standards

2019-03-07 Thread Alvaro Herrera

On 2019-Mar-07, Alvaro Herrera wrote:

> On 2019-Feb-11, Chapman Flack wrote:
> 
> > xmltable-xpath-result-processing-bugfix-6.patch includes a regress/expected
> > output for the no-libxml case that was left out of -5.
> 
> Pushed this one, with some trivial changes: I renamed and relocated
> Pavel's function to strdup-and-free and fused the new test cases to use
> less queries for the same functionality.  Naturally I had to adjust the
> expected files ... I tried to do my best but there's always a little
> something that sneaks under one's nose.

So we now have a double-free bug here or something ...  Too tired right
now to do anything about it.

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grison&dt=2019-03-08%2002%3A00%3A02

== stack trace: 
pgsql.build/src/test/regress/tmp_check/data/core ==
[New LWP 20275]

warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Core was generated by `postgres: pgbuildfarm regression [local] SELECT  
 '.
Program terminated with signal 11, Segmentation fault.
#0  0x76a32e04 in free () from /lib/arm-linux-gnueabihf/libc.so.6
#0  0x76a32e04 in free () from /lib/arm-linux-gnueabihf/libc.so.6
#1  0x76e28380 in xmlFreeNode () from /usr/lib/arm-linux-gnueabihf/libxml2.so.2
#2  0x00481f94 in xml_xmlnodetoxmltype (cur=, 
xmlerrcxt=) at xml.c:3751
#3  0x004823dc in XmlTableGetValue (state=0x148c370, colnum=1994818404, 
typid=2124685192, typmod=0, isnull=0x7ea42117) at xml.c:4540
#4  0x0026df60 in tfuncLoadRows (econtext=0x2, tstate=0x14a8578) at 
nodeTableFuncscan.c:489
#5  tfuncFetchRows (tstate=0x14a8578, econtext=0x2) at nodeTableFuncscan.c:318
#6  0x0026e248 in TableFuncNext (node=0x14a83e0) at nodeTableFuncscan.c:65
#7  0x0023e640 in ExecScanFetch (recheckMtd=0x23e640 , 
accessMtd=0x26db1c , node=0x14a83e0) at execScan.c:93
#8  ExecScan (node=0x14a83e0, accessMtd=0x26db1c , 
recheckMtd=0x23e640 ) at execScan.c:143
#9  0x0023c638 in ExecProcNodeFirst (node=0x14a83e0) at execProcnode.c:445
#10 0x00235630 in ExecProcNode (node=0x14a83e0) at 
../../../src/include/executor/executor.h:241
#11 ExecutePlan (execute_once=, dest=0x14b8d08, 
direction=, numberTuples=0, sendTuples=, 
operation=CMD_SELECT, use_parallel_mode=, planstate=0x14a83e0, 
estate=0x14a82a0) at execMain.c:1643
#12 standard_ExecutorRun (queryDesc=0x142c0e0, direction=, 
count=0, execute_once=true) at execMain.c:362
#13 0x003955f4 in PortalRunSelect (portal=0x13e3f48, forward=, 
count=0, dest=) at pquery.c:929
#14 0x00396a0c in PortalRun (portal=0x0, count=0, isTopLevel=, 
run_once=, dest=0x14b8d08, altdest=0x14b8d08, 
completionTag=0x7ea42414 "") at pquery.c:770
#15 0x003923c8 in exec_simple_query (query_string=0x7ea42414 "") at 
postgres.c:1215
#16 0x00393e8c in PostgresMain (argc=, argv=, 
dbname=0x0, username=) at postgres.c:4256
#17 0x000849a8 in BackendRun (port=0x13967b8) at postmaster.c:4399
#18 BackendStartup (port=0x13967b8) at postmaster.c:4090
#19 ServerLoop () at postmaster.c:1703
#20 0x0031607c in PostmasterMain (argc=, argv=) 
at postmaster.c:1376
#21 0x000864d4 in main (argc=7301080, argv=0x8) at main.c:228

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: PostgreSQL vs SQL/XML Standards

2019-03-07 Thread Alvaro Herrera

On 2019-Feb-11, Chapman Flack wrote:

> xmltable-xpath-result-processing-bugfix-6.patch includes a regress/expected
> output for the no-libxml case that was left out of -5.

Pushed this one, with some trivial changes: I renamed and relocated
Pavel's function to strdup-and-free and fused the new test cases to use
less queries for the same functionality.  Naturally I had to adjust the
expected files ... I tried to do my best but there's always a little
something that sneaks under one's nose.

Thank you three for your persistence on this!

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: PostgreSQL vs SQL/XML Standards

2019-03-07 Thread Chapman Flack

On 3/7/19 10:08 AM, Alvaro Herrera wrote:
> I just pushed this one to pg12 only: 
>> xmltable-xmlexists-passing-mechanisms-3.patch

Thanks!

> I removed the words "first appears in SQL:2006" from the new doc
> paragraph; we used to have similar phrases scattered here and there but
> were removed years ago, on the grounds that only the most recent version
> is relevant (I couldn't find the commit, though).

In case there are similar qualifiers in the other patches (especially
the -docfix one, there are several), while respecting that general
principle, I would offer that there might be an argument for bending
it some here ...

... one of the main things challenging intuition and expectation in the
SQL/XML part of the standard is that there were radical changes to the
'XML' data type and the underlying data model between :2003 and :2006,
and most of what's in PostgreSQL was implemented to :2003 and still
conforms to that. (Being therefore, already, a bit of an exception
to the "only the most recent version is relevant" principle.)

So there could be good reason here to call out the distinctions when
they matter here, even if in other areas of the doc the practice is
not to call them out.

-Chap

Re: PostgreSQL vs SQL/XML Standards

2019-03-07 Thread Alvaro Herrera

On 2019-Feb-11, Chapman Flack wrote:

> Interpreting the crickets as approval, I have changed the title of the
> CF entry, and the status back to Needs Review, with these patches
> attached:

I just pushed this one to pg12 only:

> xmltable-xmlexists-passing-mechanisms-3.patch

I removed the words "first appears in SQL:2006" from the new doc
paragraph; we used to have similar phrases scattered here and there but
were removed years ago, on the grounds that only the most recent version
is relevant (I couldn't find the commit, though).  Mostly left your
patch alone otherwise, other than adding some additional mark-up to the
doc changes.

Thanks!

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Re: PostgreSQL vs SQL/XML Standards

2019-03-07 Thread David Steele


On 3/6/19 6:16 PM, Chapman Flack wrote:

This CF entry shows Pavel and me as reviewers, but the included
patches were also produced by one or the other of us, so additional
review by someone who isn't us seems appropriate. :)

Would it make sense to remove one or both of us from the 'reviewers'
field in the app, to make it more obviously 'available' for reviewing?


I think it makes sense to remove both of you so I have done so.  I think 
it is assumed that co-authors will review each others work, and as you 
say it will make it harder for you to get additional reviewers.


Regards,
--
-David
da...@pgmasters.net

Re: PostgreSQL vs SQL/XML Standards

2019-03-06 Thread Chapman Flack

This CF entry shows Pavel and me as reviewers, but the included
patches were also produced by one or the other of us, so additional
review by someone who isn't us seems appropriate. :)

Would it make sense to remove one or both of us from the 'reviewers'
field in the app, to make it more obviously 'available' for reviewing?

One of the patches deals only with docs, so even someone who isn't
sure about reviewing XML functionality, but takes an interest in
documentation, would have a valuable role looking at that.

Regards,
-Chap

Re: PostgreSQL vs SQL/XML Standards

2019-03-01 Thread Ramanarayana

Hi,
Yes it is working fine with \a option in psql.

Cheers
Ram 4.0

Re: PostgreSQL vs SQL/XML Standards

2019-03-01 Thread Chapman Flack

On 03/01/19 07:15, Ramanarayana wrote:
> Hi,
> I have tested bug fixes provided by all the patches. They are working
> great. I found one minor issue
> 
>  select * from xmltable('*' PASSING 'pre arg?>&deeppost' COLUMNS x XML PATH '/');
> 
> The above query returns the xml. But there is an extra plus symbol at the
> end
> 
> pre&deeppost+

Hi,

Are you sure that isn't the + added by psql when displaying a value
that contains a newline?

What happens if you repeat the query but after the psql command

  \a

to leave the output unaligned?

Thanks!
-Chap

Re: PostgreSQL vs SQL/XML Standards

2019-02-10 Thread Chapman Flack

Note: I sent an email last night with updated patches, which was not received 
because of a spamhaus reputation issue for my email provider.

In working that out with my provider, at the moment I cannot send email at all, 
so I am using this comment to explain why the status went back to "needs 
review" with no new patches yet. I'll send them again when I can.

Re: PostgreSQL vs SQL/XML Standards

2019-02-05 Thread Chapman Flack

On 02/01/19 20:20, Michael Paquier wrote:
> On Thu, Jan 31, 2019 at 04:26:31PM +0100, Pavel Stehule wrote:
>> I'll mark this patch as ready for commiters.
> 
> For now I have moved the patch to the next CF, with the same status.

I wonder whether, given the move to next CF, it makes sense to change
the title of the CF entry from "XMLTABLE" to, more generically, XML
improvements, and get one or two more small changes in:

- get XMLPARSE(CONTENT... (and cast-to-xml with XMLOPTION=content) to
  succeed even for content with DTDs, so that the content subtype really
  does fully include the document subtype, aligning it with the SQL:2006+
  standard. I think this would be a simple patch that I can deliver early
  this month, and Tom found reports where the current behavior already
  bites people in pg_restore. Its only effect would be to allow a currently-
  failing case to succeed (and stop biting people).

- get XMLEXISTS and XMLTABLE to allow passing of named parameters. I have
  less of a sense of how difficult that might be, but I see that the libxml
  xpath API does allow them. I don't know whether they were left out of the
  original development just arbitrarily, or whether some effort was made
  and ran into some problem.

  The value of supporting the named parameters, especially when the library
  limits us to XPath 1.0, is that the XPath 1.0 limitation that a value
  passed as the context item can only be an XML 'document' only applies to
  the context item, not to named parameters. So if somebody is trying to
  port an expression ...'foo(.)' PASSING bar... and bar is not in document
  form, there's a simple rewrite to ...'foo($a) PASSING bar AS a...
  whereas if we can't do named parameters, frustration ensues.

Again, the only effect of such a change would be to permit something that
currently isn't possible. I don't think I can promise to work on a patch
for the second issue, but they both seem worthwhile, and I'm happy to
work on the first.

It seems to me these changes and the doc patch to go with them are
closely enough related to fit in one CF entry that's still smaller and
simpler than many, and that shouldn't be too difficult to review for v12,
but I'll defer to more experienced voices.

Regards,
-Chap

Re: PostgreSQL vs SQL/XML Standards

2019-02-01 Thread Michael Paquier

On Thu, Jan 31, 2019 at 04:26:31PM +0100, Pavel Stehule wrote:
> I'll mark this patch as ready for commiters.

For now I have moved the patch to the next CF, with the same status.
--
Michael


signature.asc
Description: PGP signature

Re: PostgreSQL vs SQL/XML Standards

2019-01-31 Thread Pavel Stehule

so 26. 1. 2019 v 1:38 odesílatel Chapman Flack 
napsal:

> On 01/25/19 00:45, Pavel Stehule wrote:
> > pá 25. 1. 2019 v 5:46 odesílatel Chapman Flack 
> > napsal:
> >> I am:
> >> - re-attaching xmltable-xpath-result-processing-bugfix-5.patch unchanged
> >>   (just so CF app does not lose track)
> >> - re-attaching xmltable-xmlexists-passing-mechanisms-1.patch unchanged
> >> - attaching for the first time xml-functions-type-docfix-1.patch
> >>
> >> The doc patch is made to go on top of the passing-mechanisms patch
>
> Realized xmltable-xmlexists-passing-mechanisms-1.patch didn't add
> a regression test.  Here attaching (or re-attaching):
>
> - xmltable-xpath-result-processing-bugfix-5.patch - unchanged
> - xmltable-xmlexists-passing-mechanisms-2.patch - now with test
> - xml-functions-type-docfix-1.patch - unchanged
>
> I'll venture a review opinion that all of this applies, builds, and passes
> check-world on top of 18c0da8, and that, of the issues I had identified at
> the start of this thread, these changes resolve the ones they set out to
> resolve.
>
> But the second two patches are my own work, so another reviewer is needed.
> The passing-mechanisms patch is tiny while the docfix patch is not, so
> there's an opening for a reviewer with an interest in documentation. :)
>
> There is still nothing in this patch set to address [1], though that
> also seems worth doing, perhaps in another patch, and probably not
> difficult, perhaps needing only a regex.
>
> And of course we're still saddled with all the unfixable limits
> of XPath 1.0; this patch set is fixing a few peripheral fixable things
> around that.
>
>
I am sending a review of these patches

xmltable-xpath-result-processing-bugfix-5.patch - I'll skip it - just all
tests passed

xmltable-xmlexists-passing-mechanisms-2.patch - this patch introduce new
PASSING mechanism BY VALUE - it is just syntactic sugar due compatibility
with standard. It is unhappy so previous implementation was broken and
introduced "BY REF" instead "BY VALUE", but this bug should be fixed 10
years ago. It change nothing, all tests passed and the documentation looks
ok.

Last patch is documentation only patch - I am thinking so the difference
and limits our implementation of XPath based functions are described well
and correctly.

I'll mark this patch as ready for commiters.

Regards

Pavel


> -Chap
>
>
> [1]
> https://www.postgresql.org/message-id/5BD1C44B.6040300%40anastigmatix.net
>

Re: PostgreSQL vs SQL/XML Standards

2019-01-25 Thread Chapman Flack

On 01/25/19 22:53, Pavel Stehule wrote:

> Documentation review will be harder - I am not a native speaker and I have
> not a necessary knowledges of XQuery (probably only you have this
> knowledge).

The doc patch is enough that I think it would be ideal to somehow attract
a native speaker who has interest in documentation to review it.
Even if that is someone without much XQuery-specific knowledge—I'll happily
answer whatever questions they ask about that. :)

-Chap

Re: PostgreSQL vs SQL/XML Standards

2019-01-25 Thread Chapman Flack

On 01/25/19 19:37, Chapman Flack wrote:
> There is still nothing in this patch set to address [1], though that
> also seems worth doing, perhaps in another patch, and probably not
> difficult, perhaps needing only a regex.

Heck, it could be even simpler than that. If some XML input has a DTD,
the attempt to parse it as 'content' with libxml is sure to fail early
in the parse (because that's where the DTD is found). If libxml reports
enough error detail to show it failed because of a DTD—and it appears to:

DETAIL:  line 1: StartTag: invalid element name

—then simply recognize that error and reparse as 'document' on the spot.
The early-failing first parse won't have cost much, and there is probably
next to nothing to gain by trying to be any more clever.

The one complication might that there seem to be versions of libxml that
report error detail differently (hence the variant regression test expected
files), so the code might have to recognize a couple different forms.

-Chap

> [1] https://www.postgresql.org/message-id/5BD1C44B.6040300%40anastigmatix.net

Re: PostgreSQL vs SQL/XML Standards

2019-01-25 Thread Pavel Stehule

Hi

so 26. 1. 2019 v 4:25 odesílatel Chapman Flack 
napsal:

> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, passed
> Implements feature:   tested, passed
> Spec compliant:   not tested
> Documentation:not tested
>

maybe more simple will be work with original commitfest subscriptions. I
can do review of xmltable-xmlexists-passing-mechanisms-2.patch.
Documentation review will be harder - I am not a native speaker and I have
not a necessary knowledges of XQuery (probably only you have this
knowledge).


>
> As the reporter of the issues raised in this email thread, I've reviewed
> the first patch
> and contributed the second and third.
>
> WHAT THE PATCHES DO:
>
> xmltable-xpath-result-processing-bugfix-5.patch contains code changes
> correcting
> a subset of the issues that were raised in this email thread.
>
> xmltable-xmlexists-passing-mechanisms-2.patch adjusts the grammar to allow
> the XML
> parameter passing mechanism BY VALUE as well as BY REF. Both are ignored,
> but
> formerly BY VALUE was a syntax error, which was unintuitive considering
> that BY VALUE
> is the passing mechanism PostgreSQL implements (XML node identities are
> not preserved).
>
> xml-functions-type-docfix-1.patch conforms the documentation to reflect
> the changes in
> this patch set and the limitations identified in this thread.
>
> WHAT I HAVE REVIEWED:
>
> I have applied all three patches over 18c0da8 and confirmed that
> installcheck-world passes
> and that the code changes resolve the issues they set out to resolve.
>
> I've made no entry for "spec compliant" because the question is moot; the
> spec is written
> in terms of the XQuery language, types, and concepts, and these facilities
> in PG are
> implemented on XPath 1.0, which doesn't have those. But the changes in
> this patch set
> do make the PG behaviors more, well, closely analogous to the way the spec
> compliant
> functions would behave.
>
> WHAT I HAVE NOT REVIEWED:
>
> The passing-mechanisms and docfix patches are my own work, so there should
> be another
> reviewer who is not me. I've looked closely at the technical, SQL/XML
> behavior aspects already,
> but a reviewer with an eye for documentation would be welcome.
>
> I'll venture my opinion that this is ready-for-committer to the extent of
> my own review, but will
> leave the status at needs-review for a not-me reviewer to update.

Re: PostgreSQL vs SQL/XML Standards

2019-01-25 Thread Chapman Flack

The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:not tested

As the reporter of the issues raised in this email thread, I've reviewed the 
first patch
and contributed the second and third.

WHAT THE PATCHES DO:

xmltable-xpath-result-processing-bugfix-5.patch contains code changes correcting
a subset of the issues that were raised in this email thread.

xmltable-xmlexists-passing-mechanisms-2.patch adjusts the grammar to allow the 
XML
parameter passing mechanism BY VALUE as well as BY REF. Both are ignored, but
formerly BY VALUE was a syntax error, which was unintuitive considering that BY 
VALUE
is the passing mechanism PostgreSQL implements (XML node identities are not 
preserved).

xml-functions-type-docfix-1.patch conforms the documentation to reflect the 
changes in
this patch set and the limitations identified in this thread.

WHAT I HAVE REVIEWED:

I have applied all three patches over 18c0da8 and confirmed that 
installcheck-world passes
and that the code changes resolve the issues they set out to resolve.

I've made no entry for "spec compliant" because the question is moot; the spec 
is written
in terms of the XQuery language, types, and concepts, and these facilities in 
PG are
implemented on XPath 1.0, which doesn't have those. But the changes in this 
patch set
do make the PG behaviors more, well, closely analogous to the way the spec 
compliant
functions would behave.

WHAT I HAVE NOT REVIEWED:

The passing-mechanisms and docfix patches are my own work, so there should be 
another
reviewer who is not me. I've looked closely at the technical, SQL/XML behavior 
aspects already,
but a reviewer with an eye for documentation would be welcome.

I'll venture my opinion that this is ready-for-committer to the extent of my 
own review, but will
leave the status at needs-review for a not-me reviewer to update.

Re: PostgreSQL vs SQL/XML Standards

2019-01-24 Thread Pavel Stehule

pá 25. 1. 2019 v 5:46 odesílatel Chapman Flack 
napsal:

> Hi,
>
> On 01/21/19 01:33, Pavel Stehule wrote:
> > ne 20. 1. 2019 23:13 odesílatel Chapman Flack 
> > napsal:
>
> >> form (whether or not the word LATERAL is used), and re-executes xmltable
> >> whenever the referenced column value changes. In that case, whether the
> >> default argument is evaluated at function entry or later doesn't seem
> >> to matter: the function is re-executed, so evaluating the new default
> >> at the time of entry is sufficient.
> >
> > it has sense only for volatile functions. it was not often. On second
> hand
> > deferred evaluation shoul not be a problem, and more, it is evaluated
> only
> > when it is necessary, what is more sensible for me.
>
> That makes sense. I trimmed the language about input rows and referring to
> earlier columns, and put more emphasis on the usefulness of evaluating
> volatile functions only when needed.
>
> I am:
>
> - re-attaching xmltable-xpath-result-processing-bugfix-5.patch unchanged
>   (just so CF app does not lose track)
>
> - re-attaching xmltable-xmlexists-passing-mechanisms-1.patch unchanged
>
> - attaching for the first time xml-functions-type-docfix-1.patch
>
> The doc patch is made to go on top of the passing-mechanisms patch
> (there were some doc changes in passing-mechanisms, changed again in
> the new patch to be links to the added Limits and Compatibility section).
>
> I hope the patched docs describe accurately what we have at this point.
>

looks well

Pavel


> Regards,
> -Chap
>

Re: PostgreSQL vs SQL/XML Standards

2019-01-21 Thread Chapman Flack

Hi,

In two places in the XMLTABLE implementation (XmlTableFetchRow, iterating
through a nodeset returned by the row_expression, and XmlTableGetValue,
going through a nodeset returned by the column_expression), the iteration
proceeds in index order through xpathobj->nodesetval->nodeTab.

The same happens in the older xml_xpathobjtoxmlarray, producing the array
result of the xpath() function.

In  one finds this unsettling comment:

 xmlNodePtr *nodeTab; /* array of nodes in no particular order */


So far, no matter what oddball XPath expressions I think up, nodeTab
does seem to end up in document order. It would be comforting to document
that, but the comment suggests it might not be guaranteed.

Are you aware of any statement I might have missed in the libxml docs,
where they commit to XPath evaluation returning a nodeset where nodeTab
has a predictable order? If there is a statement to that effect somewhere,
it might be worth citing in a comment in xml.c.

Regards,
-Chap



Here is a fairly oddball query, generating an XML document with about
3k names in descending order, partitioning those elements into subsets
having the same second character of the name, and unioning those back
together. You'd sort of expect that to produce a result nodeTab in goofy
order if anything could, but no, the results seem verifiably in descending
order every time, suggesting the result nodeTab is indeed being put
in document order before the XPath evaluator returns. If only they would
document that.

WITH nodeset_order_check(name, prev_name) AS (
 SELECT
  x, lag(x) OVER (ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
  FROM
   (SELECT
 string_agg(DISTINCT
  'x/n[substring(.,2,1) = "' || substring(proname from 2 for 1) || '"]',
  '|')
 FROM pg_proc) AS rowx(rowx),
   (SELECT
 XMLELEMENT(NAME x,
XMLAGG(XMLELEMENT(NAME n, proname) ORDER BY proname DESC))
 FROM pg_proc) AS src(src),
   XMLTABLE(rowx PASSING src COLUMNS x text PATH '.')
)
SELECT every(prev_name >= name IS NOT FALSE) FROM nodeset_order_check;
 every
---
 t
(1 row)

Re: PostgreSQL vs SQL/XML Standards

2019-01-20 Thread Pavel Stehule

ne 20. 1. 2019 23:13 odesílatel Chapman Flack 
napsal:

> On 01/20/19 12:48, Pavel Stehule wrote:
> >>
> >> Accordingly, I think the paragraph beginning "Unlike regular PostgreSQL
> >> functions," is more likely to perplex readers than to enlighten them.
> >> What it says about column_expression does not seem to lead to any useful
> >> difference from the behavior if it were "just like regular PostgreSQL
> >> functions".
> >
> > regular Postgres' functions has evaluated all arguments before own
> > execution. I think so this note is related much more to expressions used
> as
> > defaults.
>
> Sure, but again, is there an example, or can one easily be constructed,
> that shows the default expressions working in such a way?
>
> I am not able to use a default expression to refer to an earlier
> column in the column list of the xmltable call.
>


probably you can see the effect if you use some volatile function ..
random(), nextval(),

I think so notice in documentation was not a motivation to use it. It was
explanation of implementation and warnings against side effect.


>
> I am able to use a default expression to refer to a column of an earlier
> FROM item in the enclosing SELECT. But such a query ends up having LATERAL
> form (whether or not the word LATERAL is used), and re-executes xmltable
> whenever the referenced column value changes. In that case, whether the
> default argument is evaluated at function entry or later doesn't seem
> to matter: the function is re-executed, so evaluating the new default
> at the time of entry is sufficient.
>

it has sense only for volatile functions. it was not often. On second hand
deferred evaluation shoul not be a problem, and more, it is evaluated only
when it is necessary, what is more sensible for me.


> So, I have still not been able to construct a query that requires the
> deferred evaluation behavior. But perhaps there is a way I haven't
> thought of.
>
> Regards,
> -Chap
>

Re: PostgreSQL vs SQL/XML Standards

2019-01-20 Thread Chapman Flack

On 01/20/19 20:07, Chapman Flack wrote:

> So it appears that this example does not depend on any special treatment
> of the default_expression.
> 
> Is there an example that can be constructed that would depend on the
> special treatment (in which case, the PL/Java implementation would be
> unable to produce the same result)?

I see that I briefly forgot one difference that does matter, namely the
timing of a call to a volatile function like nextval. That detail certainly
needs to remain in the docs.

I am left wondering if that might be the only effect of the deferred
argument evaluation that really does matter.

Regards,
-Chap

Re: PostgreSQL vs SQL/XML Standards

2019-01-20 Thread Chapman Flack

On 01/20/19 17:13, Chapman Flack wrote:
> On 01/20/19 12:48, Pavel Stehule wrote:
>> regular Postgres' functions has evaluated all arguments before own
>> execution. I think so this note is related much more to expressions used as
>> defaults.
> 
> Sure, but again, is there an example, or can one easily be constructed,
> that shows the default expressions working in such a way?

To make my question more concrete, here is the current regression test
query that uses an SQL expression for a default:

SELECT xmltable.*
FROM
  xmltest2,
  xmltable(('/d/r/' || lower(_path) || 'c')
PASSING x
COLUMNS a int PATH 'x' DEFAULT ascii(_path) - 54);
 a

 11
 12
 13
 14
(4 rows)

Here is the same query desugared into a call of the PL/Java "xmltable":

SELECT xmltable.*
FROM
  xmltest2,
  LATERAL (SELECT x AS ".",   ascii(_path) - 54 AS "A_DFT") AS p,
  "xmltable"(('/d/r/' || lower(_path) || 'c'),
PASSING => p,
COLUMNS => ARRAY[
  'let $a := x return xs:int(if (exists($a)) then $a else $A_DFT)'
]
  ) AS (a int);
 a

 11
 12
 13
 14
(4 rows)

So the PL/Java version works and produces the same results. And yet
it certainly is a "regular PostgreSQL function" made with CREATE FUNCTION,
no special treatment of arguments, all evaluated before entry in the usual
way.

So it appears that this example does not depend on any special treatment
of the default_expression.

Is there an example that can be constructed that would depend on the
special treatment (in which case, the PL/Java implementation would be
unable to produce the same result)?

Regards,
-Chap

... the xs:int(...) cast above is needed for now, just because the PL/Java
implementation does not yet include the standard's algorithm to find
the right cast automatically.

Re: PostgreSQL vs SQL/XML Standards

2019-01-20 Thread Chapman Flack

On 01/20/19 12:48, Pavel Stehule wrote:
>>
>> Accordingly, I think the paragraph beginning "Unlike regular PostgreSQL
>> functions," is more likely to perplex readers than to enlighten them.
>> What it says about column_expression does not seem to lead to any useful
>> difference from the behavior if it were "just like regular PostgreSQL
>> functions".
> 
> regular Postgres' functions has evaluated all arguments before own
> execution. I think so this note is related much more to expressions used as
> defaults.

Sure, but again, is there an example, or can one easily be constructed,
that shows the default expressions working in such a way?

I am not able to use a default expression to refer to an earlier
column in the column list of the xmltable call.

I am able to use a default expression to refer to a column of an earlier
FROM item in the enclosing SELECT. But such a query ends up having LATERAL
form (whether or not the word LATERAL is used), and re-executes xmltable
whenever the referenced column value changes. In that case, whether the
default argument is evaluated at function entry or later doesn't seem
to matter: the function is re-executed, so evaluating the new default
at the time of entry is sufficient.

So, I have still not been able to construct a query that requires the
deferred evaluation behavior. But perhaps there is a way I haven't
thought of.

Regards,
-Chap

Re: PostgreSQL vs SQL/XML Standards

2019-01-20 Thread Pavel Stehule

>
> Accordingly, I think the paragraph beginning "Unlike regular PostgreSQL
> functions," is more likely to perplex readers than to enlighten them.
> What it says about column_expression does not seem to lead to any useful
> difference from the behavior if it were "just like regular PostgreSQL
> functions".
>

regular Postgres' functions has evaluated all arguments before own
execution. I think so this note is related much more to expressions used as
defaults.


> The part about usefully using volatile functions in default_expression
> remains important to mention.
>
> The statement in an earlier paragraph that "It is possible for a
> default_expression to reference the value of output columns that appear
> prior to it in the column list" still may need some rework, because it
> does not seem possible to refer to prior columns /within xmltable's own
> column list/ (though that could be useful, and I think it is intended
> in the standard). Doesn't seem to work in Oracle either
>
> While it does seem possible to refer to columns supplied by
> /earlier FROM items in the containing SELECT/, that simply results in
> multiple calls of xmltable, just as in the column_expression case.
>
> >> I think the same example would produce the same output even with feature
> >> (2)
> >> absent. It's LATERAL doing the magic there. So I am doubtful that it
> >> demonstrates (2).
> >
> > LATERAL is necessary, because  XMLTABLE can be used only in FROM clause,
> > and in this case XMLTABLE has mutable parameters.
>
> For what it's worth, if I repeat the query with the word LATERAL removed,
> it works just the same. I think that's simply because the LATERAL behavior
> is implied for a function-call FROM item, so the explicit word isn't
> needed.
> The main thing is, evaluation proceeds in the way described under LATERAL
> in the ref page for SELECT.
>

In this case the LATERAL keyword is optional - with or without this keyword
it is lateral join.

Regards

Pavel


>
> Regards,
> -Chap
>

Re: PostgreSQL vs SQL/XML Standards

2019-01-20 Thread Chapman Flack

On 01/20/19 11:55, Pavel Stehule wrote:
> input row mean a row of processed relation. The xml value from this row can
> be transformed to 0..M output rows.
> 
> The column filter expressions are evaluated once per input rows, default
> expressions are evaluated when it is necessary - possibly once for any
> output row
> ...
> it is expected - the input relation has four lines - the function was 4x
> initialized. In this case 1 call of xmltable produces 1 row.

Good ... I think we're converging on a shared understanding.

I am just used to speaking of xmltable as a simple function that executes
one row_expression one time against one supplied input, and generates 0..M
output rows from it. If there are multiple rows in a relation that you
want to apply xmltable to, that's a simple matter of calling xmltable
multiple times (which is just what SQL is doing when the xmltable is on
the RHS of an explicit or implied LATERAL).

The upshot seems to be that there is nothing necessarily special about
how xmltable treats its column_expressions: it compiles them once upon
entry to the function, as one would naïvely expect. (Or, if there is
anything more special about how the column_expression is being handled,
it seems not to be necessary, as the naïve behavior would be adequate.)

Accordingly, I think the paragraph beginning "Unlike regular PostgreSQL
functions," is more likely to perplex readers than to enlighten them.
What it says about column_expression does not seem to lead to any useful
difference from the behavior if it were "just like regular PostgreSQL
functions".

The part about usefully using volatile functions in default_expression
remains important to mention.

The statement in an earlier paragraph that "It is possible for a
default_expression to reference the value of output columns that appear
prior to it in the column list" still may need some rework, because it
does not seem possible to refer to prior columns /within xmltable's own
column list/ (though that could be useful, and I think it is intended
in the standard). Doesn't seem to work in Oracle either

While it does seem possible to refer to columns supplied by
/earlier FROM items in the containing SELECT/, that simply results in
multiple calls of xmltable, just as in the column_expression case.

>> I think the same example would produce the same output even with feature
>> (2)
>> absent. It's LATERAL doing the magic there. So I am doubtful that it
>> demonstrates (2).
> 
> LATERAL is necessary, because  XMLTABLE can be used only in FROM clause,
> and in this case XMLTABLE has mutable parameters.

For what it's worth, if I repeat the query with the word LATERAL removed,
it works just the same. I think that's simply because the LATERAL behavior
is implied for a function-call FROM item, so the explicit word isn't needed.
The main thing is, evaluation proceeds in the way described under LATERAL
in the ref page for SELECT.

Regards,
-Chap

Re: PostgreSQL vs SQL/XML Standards

2019-01-20 Thread Pavel Stehule

ne 20. 1. 2019 v 17:06 odesílatel Chapman Flack 
napsal:

> Hi,
>
> On 01/20/19 00:26, Pavel Stehule wrote:
> >>> column expressions are evaluated once per row, but XPath  expression is
> >>> compiled per row too, if I remember well.
> >>
> >> I looked for evidence of that in the code, but did not find it; the
> >> compilation appears to happen in XmlTableSetColumnFilter, which is
> >> called from tfuncInitialize.
> >
> > it is called per input row.
>
> I could be misunderstanding what you mean by 'input row' here.
>

input row mean a row of processed relation. The xml value from this row can
be transformed to 0..M output rows.

The column filter expressions are evaluated once per input rows, default
expressions are evaluated when it is necessary - possibly once for any
output row


>
> If I write a simple xmltable query where the row_expression produces
> six rows:
>
> SELECT *
> FROM
>   xmltable('pg_am/row'
> PASSING table_to_xml('pg_am', true, false, '')
> COLUMNS amname text PATH 'amname');
>
> six rows are produced, though a breakpoint set on XmlTableSetColumnFilter
> fires only once, from tfuncInitialize at the start of xmltable's execution.
>
>
> By contrast, in the regression test example with PATH
> ''||lower(_path)||'c',
> four rows are produced and the breakpoint fires four times.
>

it is expected - the input relation has four lines - the function was 4x
initialized. In this case 1 call of xmltable produces 1 row.


>
> However ... that isn't because one call to xmltable is producing four rows
> and recomputing the column_expression each time.
>
> It's because that xmltable is the RHS of a LATERAL, and the LHS of the
> LATERAL is producing four tuples with different values of columns the RHS
> depends on, so the RHS (xmltable) is being called four different times,
> producing one row each time, still with XmlTableSetColumnFilter being
> called
> only during initialization.
>
>
> > sure - using any expression in PATH clause should to demonstrate this
> > functionality.
>
> Well, there seem to be two distinct features touched on in the docs:
>
> 1. The column_expression is allowed to be an expression, not restricted
>to a string literal as it is in the standard (and Oracle).
>
> 2. Not only is it an expression, but it's an expression whose evaluation
>is deferred and can happen more than once in the same xmltable call.
>

yes, it is any expressions used there are evaluated per 1 call. If input
relation has N rows, then xmltable is initialized N times - by different
words. xmltable is evaluated independently for any processed XML document.



> The example in the regression tests certainly demonstrates (1). Without
> (1),
> it would be a syntax error.
>
> I think the same example would produce the same output even with feature
> (2)
> absent. It's LATERAL doing the magic there. So I am doubtful that it
> demonstrates (2).
>

LATERAL is necessary, because  XMLTABLE can be used only in FROM clause,
and in this case XMLTABLE has mutable parameters.


> I put some effort last night into trying to even construct any query that
> would demonstrate (2), and I came up short, but that could be my lack of
> imagination. (Somewhere in that effort I happened to notice that xmltable
> doesn't seem to be parsed successfully inside a ROWS FROM (...) construct,
> which might be another issue for another time)
>
> So, if you have a way to build a query that demonstrates (2), my aha!
> moment
> might then arrive.
>
> Regards,
> -Chap
>

Re: PostgreSQL vs SQL/XML Standards

2019-01-20 Thread Chapman Flack

Hi,

On 01/20/19 00:26, Pavel Stehule wrote:
>>> column expressions are evaluated once per row, but XPath  expression is
>>> compiled per row too, if I remember well.
>>
>> I looked for evidence of that in the code, but did not find it; the
>> compilation appears to happen in XmlTableSetColumnFilter, which is
>> called from tfuncInitialize.
> 
> it is called per input row.

I could be misunderstanding what you mean by 'input row' here.

If I write a simple xmltable query where the row_expression produces
six rows:

SELECT *
FROM
  xmltable('pg_am/row'
PASSING table_to_xml('pg_am', true, false, '')
COLUMNS amname text PATH 'amname');

six rows are produced, though a breakpoint set on XmlTableSetColumnFilter
fires only once, from tfuncInitialize at the start of xmltable's execution.

By contrast, in the regression test example with PATH ''||lower(_path)||'c',
four rows are produced and the breakpoint fires four times.

However ... that isn't because one call to xmltable is producing four rows
and recomputing the column_expression each time.

It's because that xmltable is the RHS of a LATERAL, and the LHS of the
LATERAL is producing four tuples with different values of columns the RHS
depends on, so the RHS (xmltable) is being called four different times,
producing one row each time, still with XmlTableSetColumnFilter being called
only during initialization.

> sure - using any expression in PATH clause should to demonstrate this
> functionality.

Well, there seem to be two distinct features touched on in the docs:

1. The column_expression is allowed to be an expression, not restricted
   to a string literal as it is in the standard (and Oracle).

2. Not only is it an expression, but it's an expression whose evaluation
   is deferred and can happen more than once in the same xmltable call.

The example in the regression tests certainly demonstrates (1). Without (1),
it would be a syntax error.

I think the same example would produce the same output even with feature (2)
absent. It's LATERAL doing the magic there. So I am doubtful that it
demonstrates (2).

I put some effort last night into trying to even construct any query that
would demonstrate (2), and I came up short, but that could be my lack of
imagination. (Somewhere in that effort I happened to notice that xmltable
doesn't seem to be parsed successfully inside a ROWS FROM (...) construct,
which might be another issue for another time)

So, if you have a way to build a query that demonstrates (2), my aha! moment
might then arrive.

Regards,
-Chap

Re: PostgreSQL vs SQL/XML Standards

2019-01-19 Thread Pavel Stehule

ne 20. 1. 2019 v 6:06 odesílatel Chapman Flack 
napsal:

> On 01/19/19 23:49, Pavel Stehule wrote:
>
> > If I remember, described functionality was implemented in early patches,
> > but was removed to simplify code. To now, there was not a request to do
> it.
> >
> > Unfortunately, the documentation was not fixed.
>
> I'll do that, as I'm working in there anyway. :)
>
> > column expressions are evaluated once per row, but XPath  expression is
> > compiled per row too, if I remember well.
>
> I looked for evidence of that in the code, but did not find it; the
> compilation appears to happen in XmlTableSetColumnFilter, which is
> called from tfuncInitialize.
>

it is called per input row.


> I can't guarantee I didn't miss something, though.
>
> > We designed it more tolerant as
> > we expected possibility to store XPath expression in one column and data
> in
> > second column.
>
> Perhaps if I could see an example showing the functionality... The nearest
> I could find in the regression tests was the test with
>  COLUMNS a int PATH '' || lower(_path) || 'c'
> but, again, I think that test only demonstrates how LATERAL works, not
> any behavior specific to xmltable.
>

the main reason for this was not to support support some specific patterns
- just used design doesn't requires stronger changes in executor or
internal multi call caching of some internal data. Probably you can find
discussion related to this topic in mailing list archive.

sure - using any expression in PATH clause should to demonstrate this
functionality.



> Regards,
> -Chap
>
>

Re: PostgreSQL vs SQL/XML Standards

2019-01-19 Thread Chapman Flack

On 01/19/19 23:49, Pavel Stehule wrote:

> If I remember, described functionality was implemented in early patches,
> but was removed to simplify code. To now, there was not a request to do it.
> 
> Unfortunately, the documentation was not fixed.

I'll do that, as I'm working in there anyway. :)

> column expressions are evaluated once per row, but XPath  expression is
> compiled per row too, if I remember well.

I looked for evidence of that in the code, but did not find it; the
compilation appears to happen in XmlTableSetColumnFilter, which is
called from tfuncInitialize.

I can't guarantee I didn't miss something, though.

> We designed it more tolerant as
> we expected possibility to store XPath expression in one column and data in
> second column.

Perhaps if I could see an example showing the functionality... The nearest
I could find in the regression tests was the test with
 COLUMNS a int PATH '' || lower(_path) || 'c'
but, again, I think that test only demonstrates how LATERAL works, not
any behavior specific to xmltable.

Regards,
-Chap

Re: PostgreSQL vs SQL/XML Standards

2019-01-19 Thread Pavel Stehule

ne 20. 1. 2019 v 5:37 odesílatel Chapman Flack 
napsal:

> Working slowly through the documentation, I came upon:
>
>  For XMLTABLE:
>
>   - The xmltable function produces a table based on the given XML value,
> an XPath filter to extract rows, and an optional set of column
> definitions.
> ...
> The mandatory COLUMNS clause specifies the list of columns ...
> ^
> if the COLUMNS clause is omitted, the rows in the result set contain
> 
> a single column of type xml containing the data matched by
> row_expression.
>
>This documentation seems undecided on whether the COLUMNS clause
>is mandatory or optional.
>
>It is mandatory in the SQL standard. It's mandatory in our grammar.
>We give a syntax_error if it's omitted.
>
>Is some of the documentation left over from an earlier contemplated
>design of having the clause be optional?
>
>Oracle does seem to allow the clause to be omitted, and produces a
>single xml column, as described. Was there an earlier plan to imitate
>Oracle's nonstandard behavior on that point? (Hardly seems worth the
>effort, as porting an Oracle query depending on it would simply entail
>adding COLUMNS COLUMN_VALUE XML PATH '.' and then it's portable and
>standard.)
>

If I remember, described functionality was implemented in early patches,
but was removed to simplify code. To now, there was not a request to do it.

Unfortunately, the documentation was not fixed.


>
>  - It is possible for a default_expression to reference the value of
>output columns that appear prior to it in the column list, so the
>default of one column may be based on the value of another column.
>
>   Is there an example that clearly shows this to work? If I write a
>   default_expression referring to a prior column in /xmltable's own/
>   column list, I get an undefined_column error. I can successfully refer
>   to a column of /an earlier FROM item in the SELECT/, but I am not sure
>   that demonstrates the behavior claimed here.
>
>   There is what looks like an example among the regression tests
>   (the one with DEFAULT ascii(_path) - 54), but that seems only to
>   demonstrate xmltable getting invoked four times (as documented for
>   LATERAL), not a single xmltable invocation producing multiple rows
>   with recomputed defaults.
>
>   If it's any comfort, I haven't gotten Oracle's xmltable to recognize
>   earlier columns in its own column list either.
>
>  - Unlike regular PostgreSQL functions, column_expression and
>default_expression are not evaluated to a simple value before calling
>the function. column_expression is normally evaluated exactly once
>per input row, and default_expression is evaluated each time a default
>is needed for a field.
>
>   I've already covered the question about default_expression, but what
>   this passage says about column_expression seems, at least, ambiguously
>   worded, too:
>
>   It goes without saying that /the XPath evaluator/ evaluates the
>   column_expression exactly once per input row. In the standard, that's
>   the only per-row evaluation happening; the column_expression SQL value
>   only gets compiled to an XPath expression once at the start. (In fact,
>   in the standard, it can't even be an arbitrary SQL expression, only a
>   string literal. Oracle enforces that too.)
>

column expressions are evaluated once per row, but XPath  expression is
compiled per row too, if I remember well. We designed it more tolerant as
we expected possibility to store XPath expression in one column and data in
second column.

Regards

Pavel


>
>   It seems that our implementation is meant to extend the standard and
>   actually allow the column_expression to vary per-row, and go through
>   the XPath expression compiler each time. The regression test with
>COLUMNS a int PATH '' || lower(_path) || 'c'
>   seems to be intended to confirm that behavior. But again, I think
>   it is only confirming that LATERAL results in xmltable being called
>   four consecutive times, with a different PATH in each call. It does
>   not seem to demonstrate a single xmltable call doing anything special
>   with recompiling a column path.
>
> Am I overlooking something?
>
> Regards,
> -Chap
>

Re: PostgreSQL vs SQL/XML Standards

2019-01-19 Thread Chapman Flack

Working slowly through the documentation, I came upon:

 For XMLTABLE:

  - The xmltable function produces a table based on the given XML value,
an XPath filter to extract rows, and an optional set of column
definitions.
...
The mandatory COLUMNS clause specifies the list of columns ...
^
if the COLUMNS clause is omitted, the rows in the result set contain

a single column of type xml containing the data matched by
row_expression.

   This documentation seems undecided on whether the COLUMNS clause
   is mandatory or optional.

   It is mandatory in the SQL standard. It's mandatory in our grammar.
   We give a syntax_error if it's omitted.

   Is some of the documentation left over from an earlier contemplated
   design of having the clause be optional?

   Oracle does seem to allow the clause to be omitted, and produces a
   single xml column, as described. Was there an earlier plan to imitate
   Oracle's nonstandard behavior on that point? (Hardly seems worth the
   effort, as porting an Oracle query depending on it would simply entail
   adding COLUMNS COLUMN_VALUE XML PATH '.' and then it's portable and
   standard.)

 - It is possible for a default_expression to reference the value of
   output columns that appear prior to it in the column list, so the
   default of one column may be based on the value of another column.

  Is there an example that clearly shows this to work? If I write a
  default_expression referring to a prior column in /xmltable's own/
  column list, I get an undefined_column error. I can successfully refer
  to a column of /an earlier FROM item in the SELECT/, but I am not sure
  that demonstrates the behavior claimed here.

  There is what looks like an example among the regression tests
  (the one with DEFAULT ascii(_path) - 54), but that seems only to
  demonstrate xmltable getting invoked four times (as documented for
  LATERAL), not a single xmltable invocation producing multiple rows
  with recomputed defaults.

  If it's any comfort, I haven't gotten Oracle's xmltable to recognize
  earlier columns in its own column list either.

 - Unlike regular PostgreSQL functions, column_expression and
   default_expression are not evaluated to a simple value before calling
   the function. column_expression is normally evaluated exactly once
   per input row, and default_expression is evaluated each time a default
   is needed for a field.

  I've already covered the question about default_expression, but what
  this passage says about column_expression seems, at least, ambiguously
  worded, too:

  It goes without saying that /the XPath evaluator/ evaluates the
  column_expression exactly once per input row. In the standard, that's
  the only per-row evaluation happening; the column_expression SQL value
  only gets compiled to an XPath expression once at the start. (In fact,
  in the standard, it can't even be an arbitrary SQL expression, only a
  string literal. Oracle enforces that too.)

  It seems that our implementation is meant to extend the standard and
  actually allow the column_expression to vary per-row, and go through
  the XPath expression compiler each time. The regression test with
   COLUMNS a int PATH '' || lower(_path) || 'c'
  seems to be intended to confirm that behavior. But again, I think
  it is only confirming that LATERAL results in xmltable being called
  four consecutive times, with a different PATH in each call. It does
  not seem to demonstrate a single xmltable call doing anything special
  with recompiling a column path.

Am I overlooking something?

Regards,
-Chap

Re: PostgreSQL vs SQL/XML Standards

2019-01-13 Thread Chapman Flack

There is a bug that remains, in the
else if (xpathobj->type == XPATH_STRING)
case.

As it is now, it simply passes the string value of the result
into the output column's type-input function, regardless of the
output column type.

If the output column type is xml, this will attempt to parse the
string as xml. The result should simply be xml content consisting of
a text node representing the string (as by XMLTEXT()). If it contains
XML metacharacters, they should be escaped.

For a non-xml output column, the string should be used directly,
as it is now.

# select * from xmltable('.' passing xmlelement(name a)
columns a text path '""', b xml path '""');
   a|   b
+
  | 

Oracle fiddle for comparison:

https://dbfiddle.uk/?rdbms=oracle_18&fiddle=26f91a9e55a6908c2bcf848b464ca381

A   B
  

-Chap

Re: PostgreSQL vs SQL/XML Standards

2019-01-12 Thread Pavel Stehule

ne 13. 1. 2019 v 0:39 odesílatel Chapman Flack 
napsal:

> On 12/30/18 03:23, Pavel Stehule wrote:
> > Unfortunately, there is a different releases of libxml2 with different
> > error reporting and it is hard (impossible) to prepare for all variants.
> :-/
> >
> > I prepare xml.out for my FC29 (fresh libxml2) and for no support xml.
> > Other I prepare by patching - and this error (in context) is expected.
>
> It turns out that the variant was already accounted for in the xml_2.out
> variant result file, it just needed to have the new results added.
>
> Done in xmltable-xpath-result-processing-bugfix-4.patch attached.
>
>
> On 12/31/18 01:03, Pavel Stehule wrote:
> > po 31. 12. 2018 v 3:15 odesílatel Chapman Flack 
> > napsal:
> >> But the PostgreSQL situation is a little more strange. PG uses BY VALUE
> >> semantics as the default when no passing method is specified. PG also
> uses
> >> BY VALUE semantics when BY REF is explicitly requested, which is rude,
> >> just like Oracle. But why should an explicit specification of BY VALUE
> >> (which is, after all, the semantics we're going to use anyway!) produce
> >> this?
> >>
> >> ERROR:  syntax error at or near "value"
> >>
> >> To me, that doesn't seem like least astonishment.
> >>
> >> I am not seeing what would be complicated about removing that
> astonishment
> >> by simply allowing the grammar productions to also consume BY VALUE and
> >> ignore it.
> >
> > ok - I am not against implementation of ignored BY VALUE. But I don't
> like
> > a idea to disable BY REF.
>
> Done in attached xmltable-xmlexists-passing-mechanisms-1.patch along with
> some corresponding documentation adjustments.
>
> I am still working on more extensive documentation, but it seemed best
> to include the changes related to BY REF / BY VALUE in the same patch
> with the grammar change.
>

looks well, thank you for patch

Pavel


> -Chap
>

Re: PostgreSQL vs SQL/XML Standards

2019-01-10 Thread Pavel Stehule

Hi

čt 10. 1. 2019 v 14:00 odesílatel Arthur Zakirov 
napsal:

> Hello Pavel,
>
> On 09.11.2018 07:07, Pavel Stehule wrote:
> > I used your patch and append regress tests. I checked the result against
> > Oracle.
>
> I checked the patch with Chap cases. The patch fixes handling of
> boolean, number types which mentioned in the wiki.
>
> I have a few comments related to the code and the documentation. I
> attached the patch, which fixes it.
>
> There is an example in the documentation:
>
> SELECT xmltable.*
>FROM xmlelements, XMLTABLE('/root' PASSING data COLUMNS element text);
> element
> --
> Hello2a2   bbbCC
>
> With the patch XMLTABLE returns different result now.
>
> copy_and_safe_free_xmlchar() function should be hid by #ifdef
> USE_LIBXML, otherwise I get an error if I build the Postgres without
> --with-libxml.
>
> There is a comment within XmlTableGetValue(). I changed it, mainly I
> used Markus patch from the related thread mentioned by Alvaro.
>
> Please see the changes in the patch.
>

I merged your changes, and fixed regress tests.

Thank you for patch

Regards

Pavel



> --
> Arthur Zakirov
> Postgres Professional: http://www.postgrespro.com
> Russian Postgres Company
>
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 90d67f1acf..06f3f69073 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10999,9 +10999,9 @@ $$ AS data;
 
 SELECT xmltable.*
   FROM xmlelements, XMLTABLE('/root' PASSING data COLUMNS element text);
-   element
---
-   Hello2a2   bbbCC  
+ element 
+-
+   Hello2a2   bbbxxxCC  
 ]]>
 
 
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 1cec168b2a..df7f0cc20d 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -3674,15 +3674,15 @@ SPI_sql_row_to_xmlelement(uint64 rownum, StringInfo result, char *tablename,
 #ifdef USE_LIBXML
 
 /*
- * Convert XML node to text (dump subtree in case of element,
- * return value otherwise)
+ * Convert XML node to text (dump subtree), for attribute and text
+ * returns escaped text.
  */
 static text *
 xml_xmlnodetoxmltype(xmlNodePtr cur, PgXmlErrorContext *xmlerrcxt)
 {
 	xmltype*result;
 
-	if (cur->type == XML_ELEMENT_NODE)
+	if (cur->type != XML_ATTRIBUTE_NODE && cur->type != XML_TEXT_NODE)
 	{
 		xmlBufferPtr buf;
 		xmlNodePtr	cur_copy;
@@ -4427,6 +4427,37 @@ XmlTableFetchRow(TableFuncScanState *state)
 #endif			/* not USE_LIBXML */
 }
 
+#ifdef USE_LIBXML
+/*
+ * Copy XmlChar string to PostgreSQL memory. Ensure releasing of
+ * source xmllib string.
+ */
+static char *
+copy_and_safe_free_xmlchar(xmlChar *str)
+{
+	char *result;
+
+	if (str)
+	{
+		PG_TRY();
+		{
+			result = pstrdup((char *) str);
+		}
+		PG_CATCH();
+		{
+			xmlFree(str);
+			PG_RE_THROW();
+		}
+		PG_END_TRY();
+		xmlFree(str);
+	}
+	else
+		result = NULL;
+
+	return result;
+}
+#endif
+
 /*
  * XmlTableGetValue
  *		Return the value for column number 'colnum' for the current row.  If
@@ -4475,9 +4506,9 @@ XmlTableGetValue(TableFuncScanState *state, int colnum,
 		/*
 		 * There are four possible cases, depending on the number of nodes
 		 * returned by the XPath expression and the type of the target column:
-		 * a) XPath returns no nodes.  b) One node is returned, and column is
-		 * of type XML.  c) One node, column type other than XML.  d) Multiple
-		 * nodes are returned.
+		 * a) XPath returns no nodes.  b) The target type is XML (return all
+		 * as XML).  For non-XML types:  c) One node (return content).
+		 * d) Multiple nodes (error).
 		 */
 		if (xpathobj->type == XPATH_NODESET)
 		{
@@ -4490,85 +4521,72 @@ XmlTableGetValue(TableFuncScanState *state, int colnum,
 			{
 *isnull = true;
 			}
-			else if (count == 1 && typid == XMLOID)
-			{
-text	   *textstr;
-
-/* simple case, result is one value */
-textstr = xml_xmlnodetoxmltype(xpathobj->nodesetval->nodeTab[0],
-			   xtCxt->xmlerrcxt);
-cstr = text_to_cstring(textstr);
-			}
-			else if (count == 1)
+			else
 			{
-xmlChar*str;
-xmlNodePtr	node;
-
-/*
- * Most nodes (elements and even attributes) store their data
- * in children nodes. If they don't have children nodes, it
- * means that they are empty (e.g. ). Text nodes and
- * CDATA sections are an exception: they don't have children
- * but have content in the Text/CDATA node itself.
- */
-node = xpathobj->nodesetval->nodeTab[0];
-if (node->type != XML_CDATA_SECTION_NODE &&
-	node->type != XML_TEXT_NODE)
-	node = node->xmlChildrenNode;
-
-str = xmlNodeListGetString(xtCxt->doc, node, 1);
-if (str != NULL)
+if (typid == XMLOID)
 {
-	PG_TRY();
-	{
-		cstr = pstrdup((char *) str);
-	}
-	PG_CATCH();
+	text	   *textstr;
+	StringInfoData str;
+	int			i;
+
+	/* Concatenate serialized values */
+	initStringInfo(&str);

Re: PostgreSQL vs SQL/XML Standards

2019-01-10 Thread Arthur Zakirov


Hello Pavel,

On 09.11.2018 07:07, Pavel Stehule wrote:
I used your patch and append regress tests. I checked the result against 
Oracle.


I checked the patch with Chap cases. The patch fixes handling of 
boolean, number types which mentioned in the wiki.


I have a few comments related to the code and the documentation. I 
attached the patch, which fixes it.


There is an example in the documentation:

SELECT xmltable.*
  FROM xmlelements, XMLTABLE('/root' PASSING data COLUMNS element text);
   element
--
   Hello2a2   bbbCC

With the patch XMLTABLE returns different result now.

copy_and_safe_free_xmlchar() function should be hid by #ifdef 
USE_LIBXML, otherwise I get an error if I build the Postgres without 
--with-libxml.


There is a comment within XmlTableGetValue(). I changed it, mainly I 
used Markus patch from the related thread mentioned by Alvaro.


Please see the changes in the patch.

--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 90d67f1acf..06f3f69073 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10999,9 +10999,9 @@ $$ AS data;
 
 SELECT xmltable.*
   FROM xmlelements, XMLTABLE('/root' PASSING data COLUMNS element text);
-   element
---
-   Hello2a2   bbbCC  
+ element 
+-
+   Hello2a2   bbbxxxCC  
 ]]>
 
 
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index a83882f5de..df7f0cc20d 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -4427,6 +4427,7 @@ XmlTableFetchRow(TableFuncScanState *state)
 #endif			/* not USE_LIBXML */
 }
 
+#ifdef USE_LIBXML
 /*
  * Copy XmlChar string to PostgreSQL memory. Ensure releasing of
  * source xmllib string.
@@ -4455,6 +4456,7 @@ copy_and_safe_free_xmlchar(xmlChar *str)
 
 	return result;
 }
+#endif
 
 /*
  * XmlTableGetValue
@@ -4504,9 +4506,9 @@ XmlTableGetValue(TableFuncScanState *state, int colnum,
 		/*
 		 * There are four possible cases, depending on the number of nodes
 		 * returned by the XPath expression and the type of the target column:
-		 * a) XPath returns no nodes.  b) One node is returned, and column is
-		 * of type XML.  c) One node, column type other than XML.  d) Multiple
-		 * nodes are returned.
+		 * a) XPath returns no nodes.  b) The target type is XML (return all
+		 * as XML).  For non-XML types:  c) One node (return content).
+		 * d) Multiple nodes (error).
 		 */
 		if (xpathobj->type == XPATH_NODESET)
 		{

Re: PostgreSQL vs SQL/XML Standards

2018-12-30 Thread Pavel Stehule

po 31. 12. 2018 v 3:15 odesílatel Chapman Flack 
napsal:

> On 12/30/18 15:18, Pavel Stehule wrote:
> > ne 30. 12. 2018 v 20:06 odesílatel Chapman Flack 
> > napsal:
> >> How difficult would it be to make the grammar constructs that currently
> >> accept "BY REF" (only as noise and ignore it) accept and ignore both BY
> REF
> >> and BY VALUE?
> >
> > This is difficult question - a implementation is probably very easy, but
> it
> > is hard to accept to possible break compatibility due syntactic sugar.
> >
> > This is not probably related to just XPath/XQuery question - but it is
> > related to different design of XML datatype (based on PostgreSQL TOAST)
> > against ANSI/SQL (Oracle - clob).
> >
> > So this is complicated topic and my opinion is better to don't touch it
> > because we can't to fix it, change it - and I am not sure so ANSI/SQL is
> > significantly better than PostgreSQL implementation.
>
> I am not sure I understand your point. It appears that Oracle (18c),
> just like PostgreSQL, really only supports BY VALUE semantics. Here is
> an Oracle fiddle that shows it:
>
>
> https://dbfiddle.uk/?rdbms=oracle_18&fiddle=0cb353da0d94c6d5c2659222a1e419fd
>
> When the same element is passed via two parameters, an 'is' test (node
> identity equality) of the two parameters returns false, indicating that
> Oracle has used BY VALUE semantics, not BY REF.
>
> Oracle uses BY VALUE when BY VALUE is explicitly requested, and also when
> no passing method is specified (i.e., BY VALUE is the default). Oracle also
> uses BY VALUE when BY REF is explicitly requested, which seems rather rude,
> but that must be the behavior PostgreSQL is imitating with the choice to
> accept and ignore BY REF.
>
> But the PostgreSQL situation is a little more strange. PG uses BY VALUE
> semantics as the default when no passing method is specified. PG also uses
> BY VALUE semantics when BY REF is explicitly requested, which is rude,
> just like Oracle. But why should an explicit specification of BY VALUE
> (which is, after all, the semantics we're going to use anyway!) produce
> this?
>
> ERROR:  syntax error at or near "value"
>
> To me, that doesn't seem like least astonishment.
>
> I am not seeing what would be complicated about removing that astonishment
> by simply allowing the grammar productions to also consume BY VALUE and
> ignore it.
>

ok - I am not against implementation of ignored BY VALUE. But I don't like
a idea to disable BY REF.

Regards

Pavel

>
> -Chap
>

Re: PostgreSQL vs SQL/XML Standards

2018-12-30 Thread Chapman Flack

On 12/30/18 15:18, Pavel Stehule wrote:
> ne 30. 12. 2018 v 20:06 odesílatel Chapman Flack 
> napsal:
>> How difficult would it be to make the grammar constructs that currently
>> accept "BY REF" (only as noise and ignore it) accept and ignore both BY REF
>> and BY VALUE?
> 
> This is difficult question - a implementation is probably very easy, but it
> is hard to accept to possible break compatibility due syntactic sugar.
> 
> This is not probably related to just XPath/XQuery question - but it is
> related to different design of XML datatype (based on PostgreSQL TOAST)
> against ANSI/SQL (Oracle - clob).
> 
> So this is complicated topic and my opinion is better to don't touch it
> because we can't to fix it, change it - and I am not sure so ANSI/SQL is
> significantly better than PostgreSQL implementation.

I am not sure I understand your point. It appears that Oracle (18c),
just like PostgreSQL, really only supports BY VALUE semantics. Here is
an Oracle fiddle that shows it:

https://dbfiddle.uk/?rdbms=oracle_18&fiddle=0cb353da0d94c6d5c2659222a1e419fd

When the same element is passed via two parameters, an 'is' test (node
identity equality) of the two parameters returns false, indicating that
Oracle has used BY VALUE semantics, not BY REF.

Oracle uses BY VALUE when BY VALUE is explicitly requested, and also when
no passing method is specified (i.e., BY VALUE is the default). Oracle also
uses BY VALUE when BY REF is explicitly requested, which seems rather rude,
but that must be the behavior PostgreSQL is imitating with the choice to
accept and ignore BY REF.

But the PostgreSQL situation is a little more strange. PG uses BY VALUE
semantics as the default when no passing method is specified. PG also uses
BY VALUE semantics when BY REF is explicitly requested, which is rude,
just like Oracle. But why should an explicit specification of BY VALUE
(which is, after all, the semantics we're going to use anyway!) produce
this?

ERROR:  syntax error at or near "value"

To me, that doesn't seem like least astonishment.

I am not seeing what would be complicated about removing that astonishment
by simply allowing the grammar productions to also consume BY VALUE and
ignore it.

-Chap

Re: PostgreSQL vs SQL/XML Standards

2018-12-30 Thread Pavel Stehule

ne 30. 12. 2018 v 20:06 odesílatel Chapman Flack 
napsal:

> On 12/30/18 03:23, Pavel Stehule wrote:
> > I agree with more stronger detail description about difference between
> > XPath, XPath2 and XQuery.
> >
> > Some like "The XPath 1.0 is ancestor of XPath 2.0, that is part of
> XQuery.
> > ..."
>
> I'll be happy to work on some wording to help detail the differences.
> I don't think it can be condensed to a single sentence, or even if it
> could, it would not serve users well; what's helpful is to know
> the specifics of what changed between XPath 1.0 and the 2.0-3.0-3.1 lineage
> that is now XPath, because that's what helps to understand what the
> challenges will be when porting queries from other systems, how to
> recognize
> when a query won't be possible to port at all, etc.
>
> So I think it needs to be a short section. I am still undecided what's
> the best place in the manual for the section to go:
>
>
+1


> - Added to the introductory material under functions-xml
>
> - A long  attached to a short remark in that introductory matter
>   (is there any precedent? I see there are s already in the
>   manual, but I haven't checked how long they get)
>
> - A new appendix (or new section in the SQL Conformance appendix)
>   linked from a short statement in the functions-xml introductory matter.
>
> > I don't think so link to wiki with any proposals is good idea.
>
> That makes sense. I was thinking only of linking directly to the specific
> section on XPath 1.0 compatibility issues, not to the wiki page as a whole.
> I see in grepping through the *.sgml that there are not many links to the
> wiki.postgresql.org, and the ones that exist are only in release notes,
> Further Information, The Source Code Repository, and sepgsql.
>
> So precedent seems to stand against using a link to the wiki section.
> So, the material /in/ that wiki section [1] (but omitting the two paras
> about the DOCUMENT/CONTENT node-wrapping hack) is more or less what needs
> to go into the new section in the manual.
>
> I assume it's ok to directly include links [2] and [3] out to the w3.org
> compatibility notes ... those are stable URLs to reference material.
>
> === BY REF and BY VALUE ===
>
> How difficult would it be to make the grammar constructs that currently
> accept "BY REF" (only as noise and ignore it) accept and ignore both BY REF
> and BY VALUE? The documentation ought to state that, while both forms are
> accepted and ignored, PostgreSQL's actual behavior corresponds to what the
> standard calls BY VALUE (using a string serialization as the XML datatype's
> internal form makes the BY REF semantics impossible). So it's when people
> try to port queries that explicitly say BY REF that they need to know there
> may be trouble ahead.
>

This is difficult question - a implementation is probably very easy, but it
is hard to accept to possible break compatibility due syntactic sugar.

This is not probably related to just XPath/XQuery question - but it is
related to different design of XML datatype (based on PostgreSQL TOAST)
against ANSI/SQL (Oracle - clob).

So this is complicated topic and my opinion is better to don't touch it
because we can't to fix it, change it - and I am not sure so ANSI/SQL is
significantly better than PostgreSQL implementation.


>
> -Chap
>
>
> >> [1]
> https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Related_to_version_of_XPath
> >> [2] https://www.w3.org/TR/xpath20/#id-backwards-compatibility
> >> [3]
> https://www.w3.org/TR/2010/REC-xpath-functions-20101214/#xpath1-compatibility
>

Re: PostgreSQL vs SQL/XML Standards

2018-12-30 Thread Chapman Flack

On 12/30/18 03:23, Pavel Stehule wrote:
> I agree with more stronger detail description about difference between
> XPath, XPath2 and XQuery.
> 
> Some like "The XPath 1.0 is ancestor of XPath 2.0, that is part of XQuery.
> ..."

I'll be happy to work on some wording to help detail the differences.
I don't think it can be condensed to a single sentence, or even if it
could, it would not serve users well; what's helpful is to know
the specifics of what changed between XPath 1.0 and the 2.0-3.0-3.1 lineage
that is now XPath, because that's what helps to understand what the
challenges will be when porting queries from other systems, how to recognize
when a query won't be possible to port at all, etc.

So I think it needs to be a short section. I am still undecided what's
the best place in the manual for the section to go:

- Added to the introductory material under functions-xml

- A long  attached to a short remark in that introductory matter
  (is there any precedent? I see there are s already in the
  manual, but I haven't checked how long they get)

- A new appendix (or new section in the SQL Conformance appendix)
  linked from a short statement in the functions-xml introductory matter.

> I don't think so link to wiki with any proposals is good idea.

That makes sense. I was thinking only of linking directly to the specific
section on XPath 1.0 compatibility issues, not to the wiki page as a whole.
I see in grepping through the *.sgml that there are not many links to the
wiki.postgresql.org, and the ones that exist are only in release notes,
Further Information, The Source Code Repository, and sepgsql.

So precedent seems to stand against using a link to the wiki section.
So, the material /in/ that wiki section [1] (but omitting the two paras
about the DOCUMENT/CONTENT node-wrapping hack) is more or less what needs
to go into the new section in the manual.

I assume it's ok to directly include links [2] and [3] out to the w3.org
compatibility notes ... those are stable URLs to reference material.

=== BY REF and BY VALUE ===

How difficult would it be to make the grammar constructs that currently
accept "BY REF" (only as noise and ignore it) accept and ignore both BY REF
and BY VALUE? The documentation ought to state that, while both forms are
accepted and ignored, PostgreSQL's actual behavior corresponds to what the
standard calls BY VALUE (using a string serialization as the XML datatype's
internal form makes the BY REF semantics impossible). So it's when people
try to port queries that explicitly say BY REF that they need to know there
may be trouble ahead.

-Chap

>> [1] 
>> https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Related_to_version_of_XPath
>> [2] https://www.w3.org/TR/xpath20/#id-backwards-compatibility
>> [3] 
>> https://www.w3.org/TR/2010/REC-xpath-functions-20101214/#xpath1-compatibility

Re: PostgreSQL vs SQL/XML Standards

2018-12-30 Thread Pavel Stehule

ne 30. 12. 2018 v 3:46 odesílatel Chapman Flack 
napsal:

> On 11/12/18 04:58, Pavel Stehule wrote:
> > It is assigned to January commitfest.
>
> When I build this patch against master (4203842), it builds, but breaks
> make check. The diffs seem only incidental (loss of some error context
> output), but there are no make check errors building 4203842 unmodified,
> so the patch has introduced them. Example:
>
> *** ../postgresql/src/test/regress/expected/xml.outSat Dec 29
> xml19:58:41 2018
> --- src/test/regress/results/xml.out   Sat Dec 29 19:59:11 2018
> ***
> *** 9,16 
>   LINE 1: INSERT INTO xmltest VALUES (3, '  ^
>   DETAIL:  line 1: Couldn't find end of Start Tag wrong line 1
> -  -   ^
>

Unfortunately, there is a different releases of libxml2 with different
error reporting and it is hard (impossible) to prepare for all variants. :-/

I prepare xml.out for my FC29 (fresh libxml2) and for no support xml. Other
I prepare by patching - and this error (in context) is expected.


> === ABOUT THE DOCUMENTATION ===
>
> While this patch fixes bugs in the embedding of XPath 1.0 into PostgreSQL,
> and that is good, I still think there is an elephant in the room, namely
> that XPath 1.0 isn't the right language to embed. Clearly, that can't be
> fixed as long as we rely on libxml, but I think our documentation could
> (should) do a better job of warning users of the surprises that result.
>
> My ideas for improving the documentation are not yet so finished as to
> offer a patch, but here are some thoughts:
>
> Of the 13 mentions in func.sgml of XPath (not counting its appearances
> in function and parameter names), only one specifically says XPath 1.0,
> and it is buried down at functions-xml-processing, not somewhere common
> to all of them, like the introduction to functions-xml.
>
> Really, all of our functions that refer to "XPath" mean XPath 1.0,
> and because that version was essentially thrown away to produce the 2.0
> and later versions, and that happened a dozen years ago, the documentation
> needs to make that clear. Users who have learned XQuery, or XPath in the
> last dozen years, need to know this is something very different, to avoid
> walking into the traps.
>
> I think:
>
> 1) Every reference to "XPath" or an "XPath expression" (that is the first
>in the description of a function or parameter) should be spelled out as
>"XPath 1.0".
>

> 2) There should be a section in one central place (clearly applying to
>all of the XML functions), perhaps in the functions-xml introduction,
>or a  to a paragraph there, or in an appendix (new section
>under SQL Conformance?) that summarizes the gulf between XPath 1.0
>and the later, XQuery-compatible versions. There should be cross-
>reference links to this section from the "XPath 1.0" mentions in
>individual function descriptions. The section could be a condensation
>of [1] (and perhaps contain a link to [1], if links out to the wiki
>are allowed), and link to the W3C compatibility notes at [2] and [3].
>
> 3) Currently, the description of XMLEXISTS notes that "the SQL standard
>specifies ... the construct to take an XQuery expression ... but
>PostgreSQL currently only supports XPath, which is a subset of XQuery".
>The description of XMLTABLE does not have such a note, and needs one.
>In both cases, the note should probably be nearer the top of the
>description (for XMLEXISTS, it is currently at the bottom, after the
>examples).
>
> 4) That note currently says "only supports XPath, which is a subset
>of XQuery". That would be a fair claim if we meant XPath 2.0
>or later, but not XPath 1.0 (which was not a subset of any version
>of XQuery, and has fundamental incompatibilities with it). The note,
>for both XMLEXISTS and XMLTABLE, should simply say "only supports
>XPath 1.0", with a link to the section summarizing the
> incompatibilities.
>
> Those are my thoughts on how a patch to the documentation could be
> organized. Do they seem reasonable?
>

I agree with more stronger detail description about difference between
XPath, XPath2 and XQuery.

Some like "The XPath 1.0 is ancestor of XPath 2.0, that is part of XQuery.
..."

I don't think so link to wiki with any proposals is good idea.
Documentation should to describe current state, not what can be at some
future.

Regards

Pavel


> -Chap
>
>
> [1]
>
> https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Related_to_version_of_XPath
> [2] https://www.w3.org/TR/xpath20/#id-backwards-compatibility
> [3]
>
> https://www.w3.org/TR/2010/REC-xpath-functions-20101214/#xpath1-compatibility
>

Re: PostgreSQL vs SQL/XML Standards

2018-12-29 Thread Chapman Flack

On 11/12/18 04:58, Pavel Stehule wrote:
> It is assigned to January commitfest.

When I build this patch against master (4203842), it builds, but breaks
make check. The diffs seem only incidental (loss of some error context
output), but there are no make check errors building 4203842 unmodified,
so the patch has introduced them. Example:

*** ../postgresql/src/test/regress/expected/xml.outSat Dec 29 19:58:41 2018
--- src/test/regress/results/xml.out   Sat Dec 29 19:59:11 2018
***
*** 9,16 
  LINE 1: INSERT INTO xmltest VALUES (3, ' to a paragraph there, or in an appendix (new section
   under SQL Conformance?) that summarizes the gulf between XPath 1.0
   and the later, XQuery-compatible versions. There should be cross-
   reference links to this section from the "XPath 1.0" mentions in
   individual function descriptions. The section could be a condensation
   of [1] (and perhaps contain a link to [1], if links out to the wiki
   are allowed), and link to the W3C compatibility notes at [2] and [3].

3) Currently, the description of XMLEXISTS notes that "the SQL standard
   specifies ... the construct to take an XQuery expression ... but
   PostgreSQL currently only supports XPath, which is a subset of XQuery".
   The description of XMLTABLE does not have such a note, and needs one.
   In both cases, the note should probably be nearer the top of the
   description (for XMLEXISTS, it is currently at the bottom, after the
   examples).

4) That note currently says "only supports XPath, which is a subset
   of XQuery". That would be a fair claim if we meant XPath 2.0
   or later, but not XPath 1.0 (which was not a subset of any version
   of XQuery, and has fundamental incompatibilities with it). The note,
   for both XMLEXISTS and XMLTABLE, should simply say "only supports
   XPath 1.0", with a link to the section summarizing the incompatibilities.

Those are my thoughts on how a patch to the documentation could be
organized. Do they seem reasonable?

-Chap


[1]
https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Related_to_version_of_XPath
[2] https://www.w3.org/TR/xpath20/#id-backwards-compatibility
[3]
https://www.w3.org/TR/2010/REC-xpath-functions-20101214/#xpath1-compatibility

Re: PostgreSQL vs SQL/XML Standards

2018-11-12 Thread Pavel Stehule

po 12. 11. 2018 v 6:58 odesílatel Markus Winand 
napsal:

>
> On 2018-11-9, at 05:07 , Pavel Stehule  wrote:
>
>
>
> čt 8. 11. 2018 v 15:18 odesílatel Markus Winand 
> napsal:
>
>>
>> > On 2018-11-6, at 15:23 , Pavel Stehule  wrote:
>> >
>> >
>> >
>> > po 29. 10. 2018 v 11:45 odesílatel Pavel Stehule <
>> pavel.steh...@gmail.com> napsal:
>> >
>> >
>> > po 29. 10. 2018 v 10:11 odesílatel Pavel Stehule <
>> pavel.steh...@gmail.com> napsal:
>> > Hi
>> >
>> > čt 25. 10. 2018 v 21:47 odesílatel Alvaro Herrera <
>> alvhe...@2ndquadrant.com> napsal:
>> > On 2018-Oct-25, Pavel Stehule wrote:
>> >
>> > > I am thinking so I can fix some issues related to XMLTABLE. Please,
>> send me
>> > > more examples and test cases.
>> >
>> > Please see Markus Winand's patch that I referenced upthread.
>> >
>> > here is a fix of some XMLTABLE mentioned issues.
>> >
>> > this update allows cast boolean to numeric types from XPath expressions
>> >
>> > Attached patch solves some cast issues mentioned by Chap. It solves
>> issue reported by Markus. I didn't use Markus's code, but it was
>> inspiration for me. I found native solution from libxml2.
>> >
>> > Regards
>> >
>> > Pavel
>>
>> Better than my patch.
>>
>> But I think the chunk in xml_xmlnodetoxmltype of my patch is still needed
>> — in one way or the other (see below).
>>
>> # select * from xmltable('*' PASSING 'pre> arg?>&deeppost' COLUMNS x XML PATH
>> 'node()');
>> x
>> -
>>  prec1arg&ent1&deeppost
>> (1 row)
>>
>> Output is not the original XML.
>>
>> I dug a little further and found another case that doesn’t looks right
>> even with my change to xml_xmlnodetoxmltype applied:
>>
>> # select * from xmltable('*' PASSING 'pre> arg?>&deeppost' COLUMNS x XML PATH '/');
>>  x
>> ---
>>  pre&ent1&deeppost
>> (1 row)
>>
>> Oracle gives in both cases XML.
>>
>> To fix that I included XML_DOCUMENT_NODE in the list of nodes that use
>> xmlNodeDump. Now I wonder if that logic should be reversed to use the
>> xmlXPathCastNodeToString branch in a few selected cases but default to the
>> branch xmlNodeDump for all other cases?
>>
>> I guess those few cases might be XML_ATTRIBUTE_NODE and XML_TEXT_NODE.
>> Regression tests are happy with that approach but I don’t think that proves
>> a lot.
>>
>> -markus
>>
>> diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
>> index 37d85f7..7c1f884 100644
>> --- a/src/backend/utils/adt/xml.c
>> +++ b/src/backend/utils/adt/xml.c
>> @@ -3682,7 +3682,7 @@ xml_xmlnodetoxmltype(xmlNodePtr cur,
>> PgXmlErrorContext *xmlerrcxt)
>>  {
>> xmltype*result;
>>
>> -   if (cur->type == XML_ELEMENT_NODE)
>> +   if (cur->type != XML_ATTRIBUTE_NODE && cur->type != XML_TEXT_NODE)
>> {
>> xmlBufferPtr buf;
>> xmlNodePtr  cur_copy;
>>
>>
> I used your patch and append regress tests. I checked the result against
> Oracle.
>
> Regards
>
> Pavel
>
>
> Fine from my side.
>

super

Thank you for investigation and tests, examples.

It is assigned to January commitfest.

Regards

Pavel


> -markus
>
>

Re: PostgreSQL vs SQL/XML Standards

2018-11-11 Thread Markus Winand


> On 2018-11-9, at 05:07 , Pavel Stehule  wrote:
> 
> 
> 
> čt 8. 11. 2018 v 15:18 odesílatel Markus Winand  > napsal:
> 
> > On 2018-11-6, at 15:23 , Pavel Stehule  > > wrote:
> > 
> > 
> > 
> > po 29. 10. 2018 v 11:45 odesílatel Pavel Stehule  > > napsal:
> > 
> > 
> > po 29. 10. 2018 v 10:11 odesílatel Pavel Stehule  > > napsal:
> > Hi
> > 
> > čt 25. 10. 2018 v 21:47 odesílatel Alvaro Herrera  > > napsal:
> > On 2018-Oct-25, Pavel Stehule wrote:
> > 
> > > I am thinking so I can fix some issues related to XMLTABLE. Please, send 
> > > me
> > > more examples and test cases.
> > 
> > Please see Markus Winand's patch that I referenced upthread.
> > 
> > here is a fix of some XMLTABLE mentioned issues.
> > 
> > this update allows cast boolean to numeric types from XPath expressions
> > 
> > Attached patch solves some cast issues mentioned by Chap. It solves issue 
> > reported by Markus. I didn't use Markus's code, but it was inspiration for 
> > me. I found native solution from libxml2.
> > 
> > Regards
> > 
> > Pavel
> 
> Better than my patch.
> 
> But I think the chunk in xml_xmlnodetoxmltype of my patch is still needed — 
> in one way or the other (see below).
> 
> # select * from xmltable('*' PASSING 'pre arg?>&deeppost' COLUMNS x XML PATH 
> 'node()');
> x
> -
>  prec1arg&ent1&deeppost
> (1 row)
> 
> Output is not the original XML.
> 
> I dug a little further and found another case that doesn’t looks right even 
> with my change to xml_xmlnodetoxmltype applied:
> 
> # select * from xmltable('*' PASSING 'pre arg?>&deeppost' COLUMNS x XML PATH '/');
>  x
> ---
>  pre&ent1&deeppost
> (1 row)
> 
> Oracle gives in both cases XML.
> 
> To fix that I included XML_DOCUMENT_NODE in the list of nodes that use 
> xmlNodeDump. Now I wonder if that logic should be reversed to use the 
> xmlXPathCastNodeToString branch in a few selected cases but default to the 
> branch xmlNodeDump for all other cases?
> 
> I guess those few cases might be XML_ATTRIBUTE_NODE and XML_TEXT_NODE. 
> Regression tests are happy with that approach but I don’t think that proves a 
> lot.
> 
> -markus
> 
> diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
> index 37d85f7..7c1f884 100644
> --- a/src/backend/utils/adt/xml.c
> +++ b/src/backend/utils/adt/xml.c
> @@ -3682,7 +3682,7 @@ xml_xmlnodetoxmltype(xmlNodePtr cur, PgXmlErrorContext 
> *xmlerrcxt)
>  {
> xmltype*result;
> 
> -   if (cur->type == XML_ELEMENT_NODE)
> +   if (cur->type != XML_ATTRIBUTE_NODE && cur->type != XML_TEXT_NODE)
> {
> xmlBufferPtr buf;
> xmlNodePtr  cur_copy;
> 
> 
> I used your patch and append regress tests. I checked the result against 
> Oracle.
> 
> Regards
> 
> Pavel

Fine from my side.

-markus

Re: PostgreSQL vs SQL/XML Standards

2018-11-08 Thread Pavel Stehule

čt 8. 11. 2018 v 15:18 odesílatel Markus Winand 
napsal:

>
> > On 2018-11-6, at 15:23 , Pavel Stehule  wrote:
> >
> >
> >
> > po 29. 10. 2018 v 11:45 odesílatel Pavel Stehule <
> pavel.steh...@gmail.com> napsal:
> >
> >
> > po 29. 10. 2018 v 10:11 odesílatel Pavel Stehule <
> pavel.steh...@gmail.com> napsal:
> > Hi
> >
> > čt 25. 10. 2018 v 21:47 odesílatel Alvaro Herrera <
> alvhe...@2ndquadrant.com> napsal:
> > On 2018-Oct-25, Pavel Stehule wrote:
> >
> > > I am thinking so I can fix some issues related to XMLTABLE. Please,
> send me
> > > more examples and test cases.
> >
> > Please see Markus Winand's patch that I referenced upthread.
> >
> > here is a fix of some XMLTABLE mentioned issues.
> >
> > this update allows cast boolean to numeric types from XPath expressions
> >
> > Attached patch solves some cast issues mentioned by Chap. It solves
> issue reported by Markus. I didn't use Markus's code, but it was
> inspiration for me. I found native solution from libxml2.
> >
> > Regards
> >
> > Pavel
>
> Better than my patch.
>
> But I think the chunk in xml_xmlnodetoxmltype of my patch is still needed
> — in one way or the other (see below).
>
> # select * from xmltable('*' PASSING 'pre arg?>&deeppost' COLUMNS x XML PATH
> 'node()');
> x
> -
>  prec1arg&ent1&deeppost
> (1 row)
>
> Output is not the original XML.
>
> I dug a little further and found another case that doesn’t looks right
> even with my change to xml_xmlnodetoxmltype applied:
>
> # select * from xmltable('*' PASSING 'pre arg?>&deeppost' COLUMNS x XML PATH '/');
>  x
> ---
>  pre&ent1&deeppost
> (1 row)
>
> Oracle gives in both cases XML.
>
> To fix that I included XML_DOCUMENT_NODE in the list of nodes that use
> xmlNodeDump. Now I wonder if that logic should be reversed to use the
> xmlXPathCastNodeToString branch in a few selected cases but default to the
> branch xmlNodeDump for all other cases?
>
> I guess those few cases might be XML_ATTRIBUTE_NODE and XML_TEXT_NODE.
> Regression tests are happy with that approach but I don’t think that proves
> a lot.
>
> -markus
>
> diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
> index 37d85f7..7c1f884 100644
> --- a/src/backend/utils/adt/xml.c
> +++ b/src/backend/utils/adt/xml.c
> @@ -3682,7 +3682,7 @@ xml_xmlnodetoxmltype(xmlNodePtr cur,
> PgXmlErrorContext *xmlerrcxt)
>  {
> xmltype*result;
>
> -   if (cur->type == XML_ELEMENT_NODE)
> +   if (cur->type != XML_ATTRIBUTE_NODE && cur->type != XML_TEXT_NODE)
> {
> xmlBufferPtr buf;
> xmlNodePtr  cur_copy;
>
>
I used your patch and append regress tests. I checked the result against
Oracle.

Regards

Pavel
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 37d85f71f3..7bed508e2a 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -3674,15 +3674,15 @@ SPI_sql_row_to_xmlelement(uint64 rownum, StringInfo result, char *tablename,
 #ifdef USE_LIBXML
 
 /*
- * Convert XML node to text (dump subtree in case of element,
- * return value otherwise)
+ * Convert XML node to text (dump subtree), for attribute and text
+ * returns escaped text.
  */
 static text *
 xml_xmlnodetoxmltype(xmlNodePtr cur, PgXmlErrorContext *xmlerrcxt)
 {
 	xmltype*result;
 
-	if (cur->type == XML_ELEMENT_NODE)
+	if (cur->type != XML_ATTRIBUTE_NODE && cur->type != XML_TEXT_NODE)
 	{
 		xmlBufferPtr buf;
 		xmlNodePtr	cur_copy;
@@ -4427,6 +4427,35 @@ XmlTableFetchRow(TableFuncScanState *state)
 #endif			/* not USE_LIBXML */
 }
 
+/*
+ * Copy XmlChar string to PostgreSQL memory. Ensure releasing of
+ * source xmllib string.
+ */
+static char *
+copy_and_safe_free_xmlchar(xmlChar *str)
+{
+	char *result;
+
+	if (str)
+	{
+		PG_TRY();
+		{
+			result = pstrdup((char *) str);
+		}
+		PG_CATCH();
+		{
+			xmlFree(str);
+			PG_RE_THROW();
+		}
+		PG_END_TRY();
+		xmlFree(str);
+	}
+	else
+		result = NULL;
+
+	return result;
+}
+
 /*
  * XmlTableGetValue
  *		Return the value for column number 'colnum' for the current row.  If
@@ -4490,85 +4519,72 @@ XmlTableGetValue(TableFuncScanState *state, int colnum,
 			{
 *isnull = true;
 			}
-			else if (count == 1 && typid == XMLOID)
-			{
-text	   *textstr;
-
-/* simple case, result is one value */
-textstr = xml_xmlnodetoxmltype(xpathobj->nodesetval->nodeTab[0],
-			   xtCxt->xmlerrcxt);
-cstr = text_to_cstring(textstr);
-			}
-			else if (count == 1)
+			else
 			{
-xmlChar*str;
-xmlNodePtr	node;
-
-/*
- * Most nodes (elements and even attributes) store their data
- * in children nodes. If they don't have children nodes, it
- * means that they are empty (e.g. ). Text nodes and
- * CDATA sections are an exception: they don't have children
- * but have content in the Text/CDATA node itself.
- */
-node = xpathobj->nodesetval->nodeTab[0];
-

Re: PostgreSQL vs SQL/XML Standards

2018-11-08 Thread Markus Winand

> On 2018-11-6, at 15:23 , Pavel Stehule  wrote:
> 
> 
> 
> po 29. 10. 2018 v 11:45 odesílatel Pavel Stehule  
> napsal:
> 
> 
> po 29. 10. 2018 v 10:11 odesílatel Pavel Stehule  
> napsal:
> Hi
> 
> čt 25. 10. 2018 v 21:47 odesílatel Alvaro Herrera  
> napsal:
> On 2018-Oct-25, Pavel Stehule wrote:
> 
> > I am thinking so I can fix some issues related to XMLTABLE. Please, send me
> > more examples and test cases.
> 
> Please see Markus Winand's patch that I referenced upthread.
> 
> here is a fix of some XMLTABLE mentioned issues.
> 
> this update allows cast boolean to numeric types from XPath expressions
> 
> Attached patch solves some cast issues mentioned by Chap. It solves issue 
> reported by Markus. I didn't use Markus's code, but it was inspiration for 
> me. I found native solution from libxml2.
> 
> Regards
> 
> Pavel

Better than my patch.

But I think the chunk in xml_xmlnodetoxmltype of my patch is still needed — in 
one way or the other (see below).

# select * from xmltable('*' PASSING 'pre&deeppost' COLUMNS x XML PATH 'node()');
x
-
 prec1arg&ent1&deeppost
(1 row)

Output is not the original XML.

I dug a little further and found another case that doesn’t looks right even 
with my change to xml_xmlnodetoxmltype applied:

# select * from xmltable('*' PASSING 'pre&deeppost' COLUMNS x XML PATH '/');
 x
---
 pre&ent1&deeppost
(1 row)

Oracle gives in both cases XML.

To fix that I included XML_DOCUMENT_NODE in the list of nodes that use 
xmlNodeDump. Now I wonder if that logic should be reversed to use the 
xmlXPathCastNodeToString branch in a few selected cases but default to the 
branch xmlNodeDump for all other cases?

I guess those few cases might be XML_ATTRIBUTE_NODE and XML_TEXT_NODE. 
Regression tests are happy with that approach but I don’t think that proves a 
lot.

-markus

diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 37d85f7..7c1f884 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -3682,7 +3682,7 @@ xml_xmlnodetoxmltype(xmlNodePtr cur, PgXmlErrorContext 
*xmlerrcxt)
 {
xmltype*result;

-   if (cur->type == XML_ELEMENT_NODE)
+   if (cur->type != XML_ATTRIBUTE_NODE && cur->type != XML_TEXT_NODE)
{
xmlBufferPtr buf;
xmlNodePtr  cur_copy;

Re: PostgreSQL vs SQL/XML Standards

2018-11-06 Thread Pavel Stehule

po 29. 10. 2018 v 11:45 odesílatel Pavel Stehule 
napsal:

>
>
> po 29. 10. 2018 v 10:11 odesílatel Pavel Stehule 
> napsal:
>
>> Hi
>>
>> čt 25. 10. 2018 v 21:47 odesílatel Alvaro Herrera <
>> alvhe...@2ndquadrant.com> napsal:
>>
>>> On 2018-Oct-25, Pavel Stehule wrote:
>>>
>>> > I am thinking so I can fix some issues related to XMLTABLE. Please,
>>> send me
>>> > more examples and test cases.
>>>
>>> Please see Markus Winand's patch that I referenced upthread.
>>>
>>
>> here is a fix of some XMLTABLE mentioned issues.
>>
>
> this update allows cast boolean to numeric types from XPath expressions
>

Attached patch solves some cast issues mentioned by Chap. It solves issue
reported by Markus. I didn't use Markus's code, but it was inspiration for
me. I found native solution from libxml2.

Regards

Pavel


>
> Regards
>
> Pavel
>
>
>> Regards
>>
>> Pavel
>>
>>
>>
>>>
>>> --
>>> Álvaro Herrerahttps://www.2ndQuadrant.com/
>>> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>>
>>
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 37d85f71f3..8a95045b87 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -4427,6 +4427,35 @@ XmlTableFetchRow(TableFuncScanState *state)
 #endif			/* not USE_LIBXML */
 }
 
+/*
+ * Copy XmlChar string to PostgreSQL memory. Ensure releasing of
+ * source xmllib string.
+ */
+static char *
+copy_and_safe_free_xmlchar(xmlChar *str)
+{
+	char *result;
+
+	if (str)
+	{
+		PG_TRY();
+		{
+			result = pstrdup((char *) str);
+		}
+		PG_CATCH();
+		{
+			xmlFree(str);
+			PG_RE_THROW();
+		}
+		PG_END_TRY();
+		xmlFree(str);
+	}
+	else
+		result = NULL;
+
+	return result;
+}
+
 /*
  * XmlTableGetValue
  *		Return the value for column number 'colnum' for the current row.  If
@@ -4490,85 +4519,72 @@ XmlTableGetValue(TableFuncScanState *state, int colnum,
 			{
 *isnull = true;
 			}
-			else if (count == 1 && typid == XMLOID)
-			{
-text	   *textstr;
-
-/* simple case, result is one value */
-textstr = xml_xmlnodetoxmltype(xpathobj->nodesetval->nodeTab[0],
-			   xtCxt->xmlerrcxt);
-cstr = text_to_cstring(textstr);
-			}
-			else if (count == 1)
+			else
 			{
-xmlChar*str;
-xmlNodePtr	node;
-
-/*
- * Most nodes (elements and even attributes) store their data
- * in children nodes. If they don't have children nodes, it
- * means that they are empty (e.g. ). Text nodes and
- * CDATA sections are an exception: they don't have children
- * but have content in the Text/CDATA node itself.
- */
-node = xpathobj->nodesetval->nodeTab[0];
-if (node->type != XML_CDATA_SECTION_NODE &&
-	node->type != XML_TEXT_NODE)
-	node = node->xmlChildrenNode;
-
-str = xmlNodeListGetString(xtCxt->doc, node, 1);
-if (str != NULL)
+if (typid == XMLOID)
 {
-	PG_TRY();
-	{
-		cstr = pstrdup((char *) str);
-	}
-	PG_CATCH();
+	text	   *textstr;
+	StringInfoData str;
+	int			i;
+
+	/* Concatenate serialized values */
+	initStringInfo(&str);
+	for (i = 0; i < count; i++)
 	{
-		xmlFree(str);
-		PG_RE_THROW();
+		textstr =
+			xml_xmlnodetoxmltype(xpathobj->nodesetval->nodeTab[i],
+ xtCxt->xmlerrcxt);
+
+		appendStringInfoText(&str, textstr);
 	}
-	PG_END_TRY();
-	xmlFree(str);
+	cstr = str.data;
 }
 else
 {
-	/* Ensure mapping of empty tags to PostgreSQL values. */
-	cstr = "";
-}
-			}
-			else
-			{
-StringInfoData str;
-int			i;
-
-Assert(count > 1);
+	xmlChar	   *str;
 
-/*
- * When evaluating the XPath expression returns multiple
- * nodes, the result is the concatenation of them all. The
- * target type must be XML.
- */
-if (typid != XMLOID)
-	ereport(ERROR,
-			(errcode(ERRCODE_CARDINALITY_VIOLATION),
-			 errmsg("more than one value returned by column XPath expression")));
+	if (count > 1)
+		ereport(ERROR,
+(errcode(ERRCODE_CARDINALITY_VIOLATION),
+ errmsg("more than one value returned by column XPath expression")));
 
-/* Concatenate serialized values */
-initStringInfo(&str);
-for (i = 0; i < count; i++)
-{
-	appendStringInfoText(&str,
-		 xml_xmlnodetoxmltype(xpathobj->nodesetval->nodeTab[i],
-			  xtCxt->xmlerrcxt));
+	str = xmlXPathCastNodeSetToString(xpathobj->nodesetval);
+	if (str)
+		cstr = copy_and_safe_free_xmlchar(str);
+	else
+		/* empty element */
+		cstr = "";
 }
-cstr = str.data;
 			}
 		}
 		else if (xpathobj->type == XPATH_STRING)
 		{
 			cstr = (char *) xpathobj->stringval;
 		}
+		else if (xpathobj->type == XPATH_BOOLEAN)
+		{
+			char		typcategory;
+			bool		typispreferred;
+			xmlChar	   *str;
+
+			/* Allow implicit casting from boolean to numbers */
+			get_type_category_preferred(typid, &typcategory, &typispreferred);
+
+			if (typcate

Re: PostgreSQL vs SQL/XML Standards

2018-10-29 Thread Chapman Flack

On 10/29/18 6:40 AM, Thomas Kellerer wrote:
> That line seems to be valid - but you need to pass an XMLTYPE value,
> not a VARCHAR
> 
> https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=21cdf890a26e97fa8667b2d6a960bd33

Oh, of course! Thank you. I had forgotten pass the context item
as explicitly an XML value.

That illustrates that, in a proper XMLTABLE, you can pass things
that are not XML values. When a varchar is passed in, the context
item has type xs:string. The third PATH tried to follow a node path
against a non-node context item, and correctly reported the error.

And thanks for the dbfiddle pointer. I can now confirm (in both 11g.2
and 18c):

SELECT *
FROM XMLTABLE('.'
  PASSING xmltype('')
  COLUMNS
a varchar(10) PATH '"cat" < "dog"',
b varchar(10) PATH '"cat" > "dog"',
c varchar(10) PATH 'sale/@taxable = false()'
);

A   B   C
truefalse   true

Or as numbers (There's just no SQL boolean type in Oracle, even 18g!):

SELECT *
FROM XMLTABLE('.'
  PASSING xmltype('')
  COLUMNS
a NUMBER PATH '"cat" < "dog"',
b NUMBER PATH '"cat" > "dog"',
c NUMBER PATH 'sale/@taxable = false()'
);

A   B   C
1   0   1

I removed the string() wrappings, which were only to allow the
same query to work in PG, but Pavel's proposed patches will make
them unnecessary.

Note, however, that the first proposed patch will work for the
first query (varchar results) and fail for the second (number
results). The second patch will work for the second query, but
produce the wrong strings ("1" or "0" instead of "true" or "false")
for the first. A proper XMLTABLE needs to apply the appropriate
conversion determined by the SQL type of the output column.

I believe a patch to do that correctly is possible; xml.c has
access to the type oids for the output columns, after all.

The fact that PG will return false|false|false or 0|0|0 instead
of true|false|true or 1|0|1 cannot be fixed by a patch. That is
the consequence of evaluating in XPath 1.0 (in XPath 2.0, which is
a subset of XQuery, the results would be correct).

On the same lines, we can take my original example where I forgot
to type the context item as XML, and make that work in Oracle too:

SELECT *
FROM XMLTABLE('.'
  PASSING ''
  COLUMNS
a varchar(10) PATH 'substring-after(., "taxable=")'
);

A
"false"/>

A proper XMLTABLE is happy to be passed an atomic value, such as
a string, as the context item or any named parameter, and apply
type-appropriate operators and functions to it. XPath 1.0 blocks
that for PG.

-Chap

Re: PostgreSQL vs SQL/XML Standards

2018-10-29 Thread Pavel Stehule

po 29. 10. 2018 v 10:11 odesílatel Pavel Stehule 
napsal:

> Hi
>
> čt 25. 10. 2018 v 21:47 odesílatel Alvaro Herrera <
> alvhe...@2ndquadrant.com> napsal:
>
>> On 2018-Oct-25, Pavel Stehule wrote:
>>
>> > I am thinking so I can fix some issues related to XMLTABLE. Please,
>> send me
>> > more examples and test cases.
>>
>> Please see Markus Winand's patch that I referenced upthread.
>>
>
> here is a fix of some XMLTABLE mentioned issues.
>

this update allows cast boolean to numeric types from XPath expressions

Regards

Pavel


> Regards
>
> Pavel
>
>
>
>>
>> --
>> Álvaro Herrerahttps://www.2ndQuadrant.com/
>> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>
>
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 37d85f71f3..a691dbf965 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -4569,6 +4569,41 @@ XmlTableGetValue(TableFuncScanState *state, int colnum,
 		{
 			cstr = (char *) xpathobj->stringval;
 		}
+		else if (xpathobj->type == XPATH_BOOLEAN)
+		{
+			char	typcategory;
+			bool	typispreferred;
+
+			/* Allow implicit casting from boolean to numbers */
+			get_type_category_preferred(typid, &typcategory, &typispreferred);
+
+			if (typcategory != TYPCATEGORY_NUMERIC)
+cstr = xpathobj->boolval ? "true" : "false";
+			else
+cstr = xpathobj->boolval ? "1" : "0";
+		}
+		else if (xpathobj->type == XPATH_NUMBER)
+		{
+			double fval = xpathobj->floatval;
+
+			switch (xmlXPathIsInf(fval))
+			{
+case 1:
+	cstr = "Infinity";
+	break;
+case -1:
+	cstr = "-Infinity";
+	break;
+default:
+	if (xmlXPathIsNaN(fval))
+		cstr = "NaN";
+	else if (fval == 0.0)
+		cstr = "0";
+	else
+		cstr = psprintf("%0g", fval);
+		
+			}
+		}
 		else
 			elog(ERROR, "unexpected XPath object type %u", xpathobj->type);
 
diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out
index 6e1f885112..4eb39f6169 100644
--- a/src/test/regress/expected/xml.out
+++ b/src/test/regress/expected/xml.out
@@ -1493,3 +1493,16 @@ SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c
  14
 (4 rows)
 
+-- XPath result can be boolean or number too
+SELECT * FROM XMLTABLE('*' PASSING 'a' COLUMNS a xml PATH '.', b text PATH '.', c text PATH '"hi"',  d boolean PATH '. = "a"');
+a | b | c  | d 
+--+---++---
+ a | a | hi | t
+(1 row)
+
+SELECT * FROM XMLTABLE('*' PASSING 'a' COLUMNS a xml PATH '.', b text PATH '.', c text PATH '"hi"',  d integer PATH 'string-length(.)');
+a | b | c  | d 
+--+---++---
+ a | a | hi | 1
+(1 row)
+
diff --git a/src/test/regress/sql/xml.sql b/src/test/regress/sql/xml.sql
index 3b91b56d5a..1128ddb90f 100644
--- a/src/test/regress/sql/xml.sql
+++ b/src/test/regress/sql/xml.sql
@@ -595,3 +595,7 @@ INSERT INTO xmltest2 VALUES('2', 'D');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable('/d/r' PASSING x COLUMNS a int PATH '' || lower(_path) || 'c');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c') PASSING x COLUMNS a int PATH '.');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c') PASSING x COLUMNS a int PATH 'x' DEFAULT ascii(_path) - 54);
+
+-- XPath result can be boolean or number too
+SELECT * FROM XMLTABLE('*' PASSING 'a' COLUMNS a xml PATH '.', b text PATH '.', c text PATH '"hi"',  d boolean PATH '. = "a"');
+SELECT * FROM XMLTABLE('*' PASSING 'a' COLUMNS a xml PATH '.', b text PATH '.', c text PATH '"hi"',  d integer PATH 'string-length(.)');

Re: PostgreSQL vs SQL/XML Standards

2018-10-29 Thread Thomas Kellerer

>> I have a access to too old 11.2 Oracle.  There I had to modify query
>> because there is not boolean type. I replaced bool by int, but I got a
>> error
>> ORA-19224:XPTY-004 .. expected node()*, got xs:string - it doesn't work
>> with/without string() wrappings.
>>
>The problem is in last line - the expression  "sale/@taxable = false()" is
>not valid on Oracle. Using string() wrapping is a issue, because it returns
">true", "false", but Oracle int doesn't accept it.

That line seems to be valid - but you need to pass an XMLTYPE value, not a
VARCHAR

https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=21cdf890a26e97fa8667b2d6a960bd33

As far as I can tell inside XQuery Oracle does support boolean, but not as a
return type 





--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html

Re: PostgreSQL vs SQL/XML Standards

2018-10-29 Thread Pavel Stehule

po 29. 10. 2018 v 11:05 odesílatel Pavel Stehule 
napsal:

>
> (It would not be exactly overloading, because of the special sugared
>>> syntax known to the parser, but it could look like overloading, and be
>>> intuitive to the user.)
>>>
>>> If you have convenient access to Oracle to check compatibility, could you
>>> compare this query?
>>>
>>> SELECT * FROM XMLTABLE('.'
>>> PASSING ''
>>> COLUMNS
>>> a boolean PATH 'string("cat" < "dog")',
>>> b boolean PATH 'string("cat" > "dog")',
>>> c boolean PATH 'string(sale/@taxable = false())');
>>>
>>> (I suspect in Oracle it would also work without the string() wrappings,
>>> but just to make it easy, I think this way it will work in both Oracle
>>> and PG—that is, not error, though results may differ.)
>>>
>>
>>
>>
> I have a access to too old 11.2 Oracle.  There I had to modify query
> because there is not boolean type. I replaced bool by int, but I got a error
> ORA-19224:XPTY-004 .. expected node()*, got xs:string - it doesn't work
> with/without string() wrappings.
>

The problem is in last line - the expression  "sale/@taxable = false()" is
not valid on Oracle. Using string() wrapping is a issue, because it returns
"true", "false", but Oracle int doesn't accept it.

Pavel


> Regards
>
> Pavel Stehule
>
>
>
>
>

Re: PostgreSQL vs SQL/XML Standards

2018-10-29 Thread Pavel Stehule

> (It would not be exactly overloading, because of the special sugared
>> syntax known to the parser, but it could look like overloading, and be
>> intuitive to the user.)
>>
>> If you have convenient access to Oracle to check compatibility, could you
>> compare this query?
>>
>> SELECT * FROM XMLTABLE('.'
>> PASSING ''
>> COLUMNS
>> a boolean PATH 'string("cat" < "dog")',
>> b boolean PATH 'string("cat" > "dog")',
>> c boolean PATH 'string(sale/@taxable = false())');
>>
>> (I suspect in Oracle it would also work without the string() wrappings,
>> but just to make it easy, I think this way it will work in both Oracle
>> and PG—that is, not error, though results may differ.)
>>
>
>
>
I have a access to too old 11.2 Oracle.  There I had to modify query
because there is not boolean type. I replaced bool by int, but I got a error
ORA-19224:XPTY-004 .. expected node()*, got xs:string - it doesn't work
with/without string() wrappings.

Regards

Pavel Stehule

Re: PostgreSQL vs SQL/XML Standards

2018-10-29 Thread Pavel Stehule

Hi

čt 25. 10. 2018 v 21:47 odesílatel Alvaro Herrera 
napsal:

> On 2018-Oct-25, Pavel Stehule wrote:
>
> > I am thinking so I can fix some issues related to XMLTABLE. Please, send
> me
> > more examples and test cases.
>
> Please see Markus Winand's patch that I referenced upthread.
>

here is a fix of some XMLTABLE mentioned issues.

Regards

Pavel



>
> --
> Álvaro Herrerahttps://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 37d85f71f3..51b6905927 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -4569,6 +4569,31 @@ XmlTableGetValue(TableFuncScanState *state, int colnum,
 		{
 			cstr = (char *) xpathobj->stringval;
 		}
+		else if (xpathobj->type == XPATH_BOOLEAN)
+		{
+			cstr = xpathobj->boolval ? "true" : "false";
+		}
+		else if (xpathobj->type == XPATH_NUMBER)
+		{
+			double fval = xpathobj->floatval;
+
+			switch (xmlXPathIsInf(fval))
+			{
+case 1:
+	cstr = "Infinity";
+	break;
+case -1:
+	cstr = "-Infinity";
+	break;
+default:
+	if (xmlXPathIsNaN(fval))
+		cstr = "NaN";
+	else if (fval == 0.0)
+		cstr = "0";
+	else
+		cstr = psprintf("%0g", fval);
+			}
+		}
 		else
 			elog(ERROR, "unexpected XPath object type %u", xpathobj->type);
 
diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out
index 6e1f885112..4eb39f6169 100644
--- a/src/test/regress/expected/xml.out
+++ b/src/test/regress/expected/xml.out
@@ -1493,3 +1493,16 @@ SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c
  14
 (4 rows)
 
+-- XPath result can be boolean or number too
+SELECT * FROM XMLTABLE('*' PASSING 'a' COLUMNS a xml PATH '.', b text PATH '.', c text PATH '"hi"',  d boolean PATH '. = "a"');
+a | b | c  | d 
+--+---++---
+ a | a | hi | t
+(1 row)
+
+SELECT * FROM XMLTABLE('*' PASSING 'a' COLUMNS a xml PATH '.', b text PATH '.', c text PATH '"hi"',  d integer PATH 'string-length(.)');
+a | b | c  | d 
+--+---++---
+ a | a | hi | 1
+(1 row)
+
diff --git a/src/test/regress/sql/xml.sql b/src/test/regress/sql/xml.sql
index 3b91b56d5a..1128ddb90f 100644
--- a/src/test/regress/sql/xml.sql
+++ b/src/test/regress/sql/xml.sql
@@ -595,3 +595,7 @@ INSERT INTO xmltest2 VALUES('2', 'D');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable('/d/r' PASSING x COLUMNS a int PATH '' || lower(_path) || 'c');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c') PASSING x COLUMNS a int PATH '.');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c') PASSING x COLUMNS a int PATH 'x' DEFAULT ascii(_path) - 54);
+
+-- XPath result can be boolean or number too
+SELECT * FROM XMLTABLE('*' PASSING 'a' COLUMNS a xml PATH '.', b text PATH '.', c text PATH '"hi"',  d boolean PATH '. = "a"');
+SELECT * FROM XMLTABLE('*' PASSING 'a' COLUMNS a xml PATH '.', b text PATH '.', c text PATH '"hi"',  d integer PATH 'string-length(.)');

Re: PostgreSQL vs SQL/XML Standards

2018-10-25 Thread Pavel Stehule

pá 26. 10. 2018 v 6:25 odesílatel Chapman Flack 
napsal:

> On 10/25/18 23:16, Pavel Stehule wrote:
> >> XMLTABLE would be the headache. Using the standard name for something
> >> that ain't the standard function has not left any painless way that the
> >> standard function could be added. OTOH, it has only been in the wild
> >> since 10, so renaming it to something else (xpath_table?) will probably
> >> be more painless if done soon than it ever would be later.
> >>
> > I don't share your opinion. XMLTABLE implements subset of standard. More
> it
> > is well compatible with Oracle (in this subset).
> >
> > If we have library with XPath 2.0 or higher, we can continue with it.
>
> The difficulty here is that the expression language required by the
> standard
> is XQuery, and an XPath expression (whether 1.0 or 2.0+) can always be
> parsed as an XQuery expression. (So, /syntactically/, yes, "subset".)
>
> For XPath 2.0, that is no problem, because an XPath 2.0 expression and
> the identically-spelled XQuery expression /mean the same thing/.
>
> For XPath 1.0, it is very definitely a problem, because an XPath 1.0
> expression and the identically-spelled XQuery expression /do not mean
> the same thing/. Some of the important semantic differences are in [1].
>
> So, if a future PostgreSQL version has an XMLTABLE function that accepts
> XQuery, as the standard requires, and existing users upgrade and they have
> XMLTABLE query expressions written as XPath 1.0, those queries will be
> accepted and parsed, but they will not mean the same thing. The function
> will not be able to tell when it is being called with XQuery semantics
> intended, vs. when it is being called with XPath 1.0 semantics intended.
>

If we have a library with XQuery, then we can change the behave. But world
accepting of XQuery is not wide, unfortunately.

When I wrote and tested XMLTABLE, I found only few examples of where XQuery
was used. So first there should be any library with XQUery implementation
that can be used in Postgres. This library should be fast, well tested
without memory leaks. Elsewhere discussion is premature. I am not terrible
happy from libxml2 design, documentation, manuals - and I will not against
we can migrate to some better. On second hand - libxml2 code is working -
and it is widely used. It can be big mistake if we use Java library and if
we create dependency on Java. After, there are only few libs that doesn't
significantly better than libxml2.



> Now, perhaps there is a nicer way than renaming the function. It could
> work like overloading. Create two trivial domains over text, say xpath1
> and xquery, and have two XMLTABLE functions with different first parameter
> types. Then if you called with the expression '"cat" < "dog"'::xquery
> you would get the correct result 't', and with '"cat" < "dog"'::xpath1
> you would get the (also correct) result 'f'.
>

Probably it can works, but needs more work on Postgres infrastructure. If
you overload functions like this, then type should be used every time.


> (It would not be exactly overloading, because of the special sugared
> syntax known to the parser, but it could look like overloading, and be
> intuitive to the user.)
>
> If you have convenient access to Oracle to check compatibility, could you
> compare this query?
>
> SELECT * FROM XMLTABLE('.'
> PASSING ''
> COLUMNS
> a boolean PATH 'string("cat" < "dog")',
> b boolean PATH 'string("cat" > "dog")',
> c boolean PATH 'string(sale/@taxable = false())');
>
> (I suspect in Oracle it would also work without the string() wrappings,
> but just to make it easy, I think this way it will work in both Oracle
> and PG—that is, not error, though results may differ.)
>

I will test it.

Regards

Pavel


> -Chap
>
>
> [1]
>
> https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Related_to_version_of_XPath
>

Re: PostgreSQL vs SQL/XML Standards

2018-10-25 Thread Chapman Flack

On 10/25/18 23:16, Pavel Stehule wrote:
>> XMLTABLE would be the headache. Using the standard name for something
>> that ain't the standard function has not left any painless way that the
>> standard function could be added. OTOH, it has only been in the wild
>> since 10, so renaming it to something else (xpath_table?) will probably
>> be more painless if done soon than it ever would be later.
>>
> I don't share your opinion. XMLTABLE implements subset of standard. More it
> is well compatible with Oracle (in this subset).
> 
> If we have library with XPath 2.0 or higher, we can continue with it.

The difficulty here is that the expression language required by the standard
is XQuery, and an XPath expression (whether 1.0 or 2.0+) can always be
parsed as an XQuery expression. (So, /syntactically/, yes, "subset".)

For XPath 2.0, that is no problem, because an XPath 2.0 expression and
the identically-spelled XQuery expression /mean the same thing/.

For XPath 1.0, it is very definitely a problem, because an XPath 1.0
expression and the identically-spelled XQuery expression /do not mean
the same thing/. Some of the important semantic differences are in [1].

So, if a future PostgreSQL version has an XMLTABLE function that accepts
XQuery, as the standard requires, and existing users upgrade and they have
XMLTABLE query expressions written as XPath 1.0, those queries will be
accepted and parsed, but they will not mean the same thing. The function
will not be able to tell when it is being called with XQuery semantics
intended, vs. when it is being called with XPath 1.0 semantics intended.

Now, perhaps there is a nicer way than renaming the function. It could
work like overloading. Create two trivial domains over text, say xpath1
and xquery, and have two XMLTABLE functions with different first parameter
types. Then if you called with the expression '"cat" < "dog"'::xquery
you would get the correct result 't', and with '"cat" < "dog"'::xpath1
you would get the (also correct) result 'f'.

(It would not be exactly overloading, because of the special sugared
syntax known to the parser, but it could look like overloading, and be
intuitive to the user.)

If you have convenient access to Oracle to check compatibility, could you
compare this query?

SELECT * FROM XMLTABLE('.'
PASSING ''
COLUMNS
a boolean PATH 'string("cat" < "dog")',
b boolean PATH 'string("cat" > "dog")',
c boolean PATH 'string(sale/@taxable = false())');

(I suspect in Oracle it would also work without the string() wrappings,
but just to make it easy, I think this way it will work in both Oracle
and PG—that is, not error, though results may differ.)

-Chap

[1]
https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Related_to_version_of_XPath

Re: PostgreSQL vs SQL/XML Standards

2018-10-25 Thread Chapman Flack

On 10/25/18 09:56, Alvaro Herrera wrote:
> Would you review Markus Winand patch here?
> https://postgr.es/m/8bdb0627-2105-4564-aa76-7849f028b...@winand.at
> I think doing that would probably point out a couple of ways in which
> our XMLTABLE implementation is non-conformant, and then fixes it :-)
> I've been unsure as to applying it to all branches since 10 or just to
> master.

Well, modulo the key observation that it simply *is not* conformant
until it accepts XML Query expressions and uses the XPath 2.0 type system
and data model and the SQL/XML 2006+ casting rules ...

... and there is no ISO standard that says anything about how an XPath 1.0-
based quasi-XMLTABLE-ish function ought to behave, so it's hard to say
that anything this function does is right or wrong, per ISO ...

I think all of the changes in these patches do make it a more useful
quasi-XMLTABLE-ish function, as the pre-patch behaviors were less useful
(if not outright bewildering). And they produce output that better matches
what the XQuery-based ISO rules produce (for the subset of queries that
mean the same thing as XQuery and as XPath 1.0).

I also looked at the (not yet applied?)
XML-XPath-comments-processing-instructions-array-ind patch and I think
it, too, makes the behavior more useful. I did not actually take the
time to build a PostgreSQL with the patch, but I took the two added
regression queries, syntax-desugared them[1] and called the Saxon XQuery-
based "xmltable" example with them, and got the same expected results:

SELECT xmltable.* FROM
(SELECT 'a1aa2a 
xxx'::xml AS ".") AS p,
"xmltable"('/root', PASSING => p, COLUMNS => ARRAY[
'string(element)', 'string(element/comment()[2])',
'string(element/processing-instruction())', 'string(element/text()[1])',
'serialize(element)'])
AS (element text, cmnt text, pi text, t1 text, x text);

   element| cmnt | pi | t1  |
 x
--+--++-+-
 a1aa2a   xxx | z| pi | a1a | a1aa2a   xxx
(1 row)

SELECT xmltable.* FROM
(SELECT 'a1aa2a 
xxx'::xml AS ".") AS p,
"xmltable"('/root', PASSING => p, COLUMNS => ARRAY[
'string(element/text())', 'string(element/comment()[2])',
'string(element/processing-instruction())', 'string(element/text()[1])',
'serialize(element)'])
AS (element text, cmnt text, pi text, t1 text, x text);

ERROR:  java.sql.SQLException: A sequence of more than one item is not
allowed as the first argument of fn:string() (text("a1a"), text("a2a"))

Agreement. Agreement is good. :)

So I think they are worth applying. I can't bring myself to a strong
opinion on whether they are or aren't worth backpatching; if it were
the function described by the standard, they'd be bugs and they would
be, but does making a non-standard function behave slightly more like
the standard function that it isn't count as a bug fix or an enhancement?

My overall feeling, at least in directing my own effort, is that I'd rather
spend time toward getting the real XQuery-based semantics in place somehow,
and ongoing enhancements to the XPath-1.0-based stuff feel more like
pouring treasure down a hole.

But these enhancements seem like good ones, and if there's interest in
patching a couple more, the "unexpected XPath object type {2,3}" in [2]
might be good candidates. That would be backpatchable, as the current
behavior clearly isn't useful.

One other thing: I think the commit message on the context-item patch
is really somewhat misleading: "According to the SQL standard, the context
of XMLTABLE's XPath row_expression is the document node of the XML input
document..." Really the standard says nothing of the sort. It is a
limitation of XPath 1.0 that the input even has to be a document at all.
In the real XMLTABLE, you could be passing a context item that is a
document node (of either 'document' or 'content' flavor), or a one-item
'sequence' holding an atomic type like a number or date, or even a naked
XML node like a PI or comment or attribute node you're more used to seeing
only inside a document. The real rule is just that the context item is
exactly the thing you passed (or a copy of it, when the rules say so).
It collapses in the XPath 1.0 case to having to be a document node, simply
because that's the only thing you can pass in.

What was wrong with the pre-patch code was that it wasn't just using
the 'doc' it was given, but actually calling xmlDocGetRootElement() on it
and setting the CI to one of its children. The patch just directly
assigns doc to xpathctx->node, which I would call correct, not because
it's a document node, but because it's the thing that was passed.

-Chap

[1] You might notice in addition to desugaring the XMLTABLE syntax, I
wrapped the text-returning column paths in string(), and wrapped the
xml-returning one in serialize() and changed its result column to text.
Those changes are just to work around the parts of the Sax

Re: PostgreSQL vs SQL/XML Standards

2018-10-25 Thread Pavel Stehule

>
> XMLTABLE would be the headache. Using the standard name for something
> that ain't the standard function has not left any painless way that the
> standard function could be added. OTOH, it has only been in the wild
> since 10, so renaming it to something else (xpath_table?) will probably
> be more painless if done soon than it ever would be later.
>
>
I don't share your opinion. XMLTABLE implements subset of standard. More it
is well compatible with Oracle (in this subset).

If we have library with XPath 2.0 or higher, we can continue with it.

Regards

Pavel

Re: PostgreSQL vs SQL/XML Standards

2018-10-25 Thread Chapman Flack

On 10/25/18 11:15, Pavel Stehule wrote:
> čt 25. 10. 2018 v 17:09 odesílatel Alvaro Herrera 
> napsal:
>> Maybe the best way forward is to implement all the JSON functionality
>> and remove the SQL/XML bits.
> 
> It can be bigger compatibility break in Postgres history. SQL/XML functions
> are widely used.

It seems to me that evolution to the 2006+ standard version could be done
mostly non-disruptively (provided an agreeable library can be found).

I think Tom's suggestion[1] to just make XML OPTION CONTENT mean what
it means in 2006+ would be an easy change to make immediately, and would
not disrupt anybody ... it would only make some things succeed that now
fail, and it would match what our documentation already says. It would
make our XML type equivalent to 2006+ XML(CONTENT(ANY)).

Beyond that, further steps toward 2006+ could largely avoid disruption.

If we implement the typmod'ed XML types, surely the parser would simply
treat untypmod'ed 'XML' as meaning XML(CONTENT(ANY)). (The standard does
allow for the typmod to be missing, and leaves it "implementation-defined
whether SEQUENCE, CONTENT(ANY), or CONTENT(UNTYPED) is implicit", so
that's all by the book.)

The existing functions xpath and xpath_exists can be kept unchanged,
as their names are distinct from anything in the standard. A library
that supports XQuery is likely also to support XPath in "1.0 compatibility
mode", so those functions could keep their semantics.

The current xmlvalidate() has the wrong semantics and return type, but it
also does nothing but ereport unimplemented, so no current uses would be
hurt by redefining it.

XMLTABLE would be the headache. Using the standard name for something
that ain't the standard function has not left any painless way that the
standard function could be added. OTOH, it has only been in the wild
since 10, so renaming it to something else (xpath_table?) will probably
be more painless if done soon than it ever would be later.

On 10/25/18 11:23, Andreas Karlsson wrote:
> The libraries we depend on should really either be available in the
> package repositories of the major Linux distribution or be something
> we can put in our own repository and maintain without too much pain.
> So using Saxon/C does not seem like a realistic option.

That makes good sense. The approach I proposed in [2] would be to
target the XQC API as an integration point. If there is one library
that might be most acceptable (it seems xqilla is in several repositories),
it could become a preferred or supported choice, but others could be
available if an administrator wanted to separately obtain them, perhaps
because of better performance on a particular workload, or avoidance of
some bug that a given workload turns up.

-Chap

[1]: https://www.postgresql.org/message-id/22271.1540458133%40sss.pgh.pa.us
[2]:
https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#One_proposal

Re: PostgreSQL vs SQL/XML Standards

2018-10-25 Thread Alvaro Herrera

On 2018-Oct-25, Pavel Stehule wrote:

> I am thinking so I can fix some issues related to XMLTABLE. Please, send me
> more examples and test cases.

Please see Markus Winand's patch that I referenced upthread.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: PostgreSQL vs SQL/XML Standards

2018-10-25 Thread Pavel Stehule

Hi


> But a roadmap that could lead to eventual availability of one of the
> C/C++ implementations would be nice too.
>

Somebody should to do some work and write patch :/. Although libxml2 is
after feature freeze - it is code widely used. The change of XML support
should be safe, because there can be lot of work.

I am thinking so I can fix some issues related to XMLTABLE. Please, send me
more examples and test cases.

Regards

Pavel

Re: PostgreSQL vs SQL/XML Standards

2018-10-25 Thread Jesper Pedersen


On 10/25/18 2:33 PM, Andrew Dunstan wrote:


Yeah, very good point. xqilla/xerces-C appears to be widely available 
(Centos and ubuntu, at least).




xqilla/xerces-c are in the Fedora/RHEL repo too.

Best regards,
 Jesper

Re: PostgreSQL vs SQL/XML Standards

2018-10-25 Thread Chapman Flack

On 10/25/18 11:08 AM, Alvaro Herrera wrote:

> XQilla seems to depend on Xerces, and seems to have died in 2011.

¿Eh? The latest release, 2.3.4 [1], is dated 2018-07-03.

It looks like the latest development has been happening on the
xquilla_2_3 branch. Sometimes project "activity" statistics rely
exclusively on the "master" branch ("xqilla" branch in this case),
and are deceptive if the project isn't being developed exclusively
by coding on master and backpatching to others.

I've noticed I'm facing the same thing in PL/Java ... plenty of
development lately, but on the REL1_5_STABLE branch. GitHub's
project statistics (and also Open Hub's) are just looking at master
and saying the project's been dead for two years. Now that 1.5.1 is
released, as soon as I get some of REL1_5_STABLE merged /up/ into
master, the statistics will probably magically show it's been alive
all along.

> Zorba appears to have been taken propietary, from the looks of its last
> commits.

It does seem harder to see what's going on there, but the commits
with "copyright changed" as the message turn out to be changing
only the copyright holder in the Apache 2.0 license from
"The FLWOR Foundation" to "zorba.io". But Matthias Brantner
participated with interest in the 2010 thread here where Zorba
was brought up before[2], so he may know something.

> Integrating a C runtime of a Java library sounds nightmarish --
> I wouldn't even think about that.

Or whether or not nightmarish, certainly duplicative of something
we can kinda already do.

In a way, some of the pressure is off, because if you need
a true XMLQUERY or XMLTABLE, you can get them with the Saxon-in-PL/Java
implementation, and right now in any supported PG version. You just
have to spell them funny, doing without the sugary syntax built into
the parser. They're missing some of the automatic casts from
the standard at the moment, which isn't really a loss of function,
as explicit casts can be added to any query needing them ... a temporary
annoyance until the rest of that example's in place.

But a roadmap that could lead to eventual availability of one of the
C/C++ implementations would be nice too.

-Chap

[1]: https://sourceforge.net/projects/xqilla/files/
[2]:
https://www.postgresql.org/message-id/7DDDB18E-041F-4238-B91D-3277EB1CE5BC%4028msec.com

Re: PostgreSQL vs SQL/XML Standards

2018-10-25 Thread Andrew Dunstan





On 10/25/2018 11:23 AM, Andreas Karlsson wrote:

On 10/25/2018 03:53 PM, Chapman Flack wrote:

On 10/25/18 10:39 AM, Tom Lane wrote:

I think getting out from under libxml2's idiosyncrasies and security
lapses would be great, but is there a plausible alternative out there?


Depends on whether anything in [1] sounds plausible.


The libraries we depend on should really either be available in the 
package repositories of the major Linux distribution or be something 
we can put in our own repository and maintain without too much pain. 
So using Saxon/C does not seem like a realistic option.





Yeah, very good point. xqilla/xerces-C appears to be widely available 
(Centos and ubuntu, at least).


cheers

andrew

--
Andrew Dunstanhttps://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: PostgreSQL vs SQL/XML Standards

2018-10-25 Thread Andreas Karlsson


On 10/25/2018 03:53 PM, Chapman Flack wrote:

On 10/25/18 10:39 AM, Tom Lane wrote:

I think getting out from under libxml2's idiosyncrasies and security
lapses would be great, but is there a plausible alternative out there?


Depends on whether anything in [1] sounds plausible.


The libraries we depend on should really either be available in the 
package repositories of the major Linux distribution or be something we 
can put in our own repository and maintain without too much pain. So 
using Saxon/C does not seem like a realistic option.


Andreas

Re: PostgreSQL vs SQL/XML Standards

2018-10-25 Thread Pavel Stehule

čt 25. 10. 2018 v 17:09 odesílatel Alvaro Herrera 
napsal:

> On 2018-Oct-25, Chapman Flack wrote:
>
> > On 10/25/18 10:39 AM, Tom Lane wrote:
> > > I think getting out from under libxml2's idiosyncrasies and security
> > > lapses would be great, but is there a plausible alternative out there?
> >
> > Depends on whether anything in [1] sounds plausible.
> >
> > [1]:
> >
> https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Possible_ways_forward
>
> Heh, I didn't notice this part of the document.  Integrating a C runtime
> of a Java library sounds nightmarish -- I wouldn't even think about
> that.
>
> XQilla seems to depend on Xerces, and seems to have died in 2011.
>
> Zorba appears to have been taken propietary, from the looks of its last
> commits.
>
> Maybe the best way forward is to implement all the JSON functionality
> and remove the SQL/XML bits.
>

It can be bigger compatibility break in Postgres history. SQL/XML functions
are widely used.

Regards

Pavel


> --
> Álvaro Herrerahttps://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>

Re: PostgreSQL vs SQL/XML Standards

2018-10-25 Thread Alvaro Herrera

On 2018-Oct-25, Chapman Flack wrote:

> On 10/25/18 10:39 AM, Tom Lane wrote:
> > I think getting out from under libxml2's idiosyncrasies and security
> > lapses would be great, but is there a plausible alternative out there?
> 
> Depends on whether anything in [1] sounds plausible.
> 
> [1]:
> https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Possible_ways_forward

Heh, I didn't notice this part of the document.  Integrating a C runtime
of a Java library sounds nightmarish -- I wouldn't even think about
that.

XQilla seems to depend on Xerces, and seems to have died in 2011.

Zorba appears to have been taken propietary, from the looks of its last
commits.

Maybe the best way forward is to implement all the JSON functionality
and remove the SQL/XML bits.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: PostgreSQL vs SQL/XML Standards

2018-10-25 Thread Chapman Flack

On 10/25/18 10:39 AM, Tom Lane wrote:
> I think getting out from under libxml2's idiosyncrasies and security
> lapses would be great, but is there a plausible alternative out there?

Depends on whether anything in [1] sounds plausible.

-Chap


[1]:
https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Possible_ways_forward

Re: PostgreSQL vs SQL/XML Standards

2018-10-25 Thread Tom Lane

Alvaro Herrera  writes:
> On 2018-Oct-24, Chapman Flack wrote:
>> https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards

> Wow, that's ... overwhelming.  (I do wonder if we should stop relying on
> libxml2 and instead look for something supporting XQuery).

I think getting out from under libxml2's idiosyncrasies and security
lapses would be great, but is there a plausible alternative out there?

regards, tom lane

Re: PostgreSQL vs SQL/XML Standards

2018-10-25 Thread Alvaro Herrera

On 2018-Oct-24, Chapman Flack wrote:

> Inspired by the wiki page on PostgreSQL vs SQL Standard in general,
> I have made another wiki page specifically about $subject. I hope
> this was not presumptuous, and invite review / comment. I have not
> linked to it from any other page yet.
> 
> https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards

Wow, that's ... overwhelming.  (I do wonder if we should stop relying on
libxml2 and instead look for something supporting XQuery).

Would you review Markus Winand patch here?
https://postgr.es/m/8bdb0627-2105-4564-aa76-7849f028b...@winand.at
I think doing that would probably point out a couple of ways in which
our XMLTABLE implementation is non-conformant, and then fixes it :-)
I've been unsure as to applying it to all branches since 10 or just to
master.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

PostgreSQL vs SQL/XML Standards

2018-10-24 Thread Chapman Flack

Inspired by the wiki page on PostgreSQL vs SQL Standard in general,
I have made another wiki page specifically about $subject. I hope
this was not presumptuous, and invite review / comment. I have not
linked to it from any other page yet.

https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards

-Chap

83 matches

Mail list logo