Hi

2018-01-23 8:13 GMT+01:00 Kyotaro HORIGUCHI <horiguchi.kyot...@lab.ntt.co.jp
>:

> Hello, I returned to this.
>
> I thouroughly checked the translator's behavior against the XPath
> specifications and checkd out the documentation and regression
> test. Almost everything is fine for me and this would be the last
> comment from me.
>
> At Fri, 24 Nov 2017 18:32:43 +0100, Pavel Stehule <pavel.steh...@gmail.com>
> wrote in <CAFj8pRB7Fs_2DrtUTGhTmQb+KReXPH6SG62hGWO3KVL_eZYCaA@
> mail.gmail.com>
> > 2017-11-24 18:13 GMT+01:00 Pavel Stehule <pavel.steh...@gmail.com>:
> >
> > >
> > >
> > > 2017-11-24 17:53 GMT+01:00 Pavel Stehule <pavel.steh...@gmail.com>:
> > >
> > >> Hi
> > >>
> > >> 2017-11-22 22:49 GMT+01:00 Thomas Munro <
> thomas.mu...@enterprisedb.com>:
> > >>
> > >>> On Thu, Nov 9, 2017 at 10:11 PM, Pavel Stehule <
> pavel.steh...@gmail.com>
> > >>> wrote:
> > >>> > Attached new version.
> > >>>
> > >>> Hi Pavel,
> > >>>
> > >>> FYI my patch testing robot says[1]:
> > >>>
> > >>>      xml                      ... FAILED
> > >>>
> > >>> regression.diffs says:
> > >>>
> > >>> + SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'),
> > >>> '/rows/row' PASSING t1.doc COLUMNS data int PATH
> > >>> 'child::a[1][attribute::hoge="haha"]') as x;
> > >>> + data
> > >>> + ------
> > >>> + (0 rows)
> > >>> +
> > >>>
> > >>> Maybe you forgot to git-add the expected file?
> > >>>
> > >>> [1] https://travis-ci.org/postgresql-cfbot/postgresql/
> builds/305979133
> > >>
> > >>
> > >> unfortunately xml.out has 3 versions and is possible so one version
> > >> should be taken elsewhere than my comp.
> > >>
> > >> please can me send your result xml.out file?
> > >>
> > >
> > > looks like this case is without xml support so I can fix on my comp.
> > >
> > >
> > fixed regress test
>
> (I wouldn't have found that..)
>
> I have three comments on the behavior and one on documentation.
>
> 1. Lack of syntax handling.
>
> ["'" [^'] "'"] is also a string literal, but getXPathToken is
> forgetting that and applying default namespace mistakenly to the
> astring content.
>

In this case, I am not sure if I understand well.

Within expressions, literal strings are delimited by single or double
quotation marks, which are also used to delimit XML attributes. To avoid a
quotation mark in an expression being interpreted by the XML processor as
terminating the attribute value the quotation mark can be entered as a
character reference (&quot; or &apos;). Alternatively, the expression can
use single quotation marks if the XML attribute is delimited with double
quotation marks or vice-versa.

So if I understand well, then XML string can looks like ' some " some ' or
" some ' some ". I fixed it.


> 2. Additional comment might be good.
>
> It might be better having additional description about default
> namespace in the comment starts from "Namespace mappings are
> passed as text[]" in xpth_internal().
>

fixed


> 3. Inconsistent behavior from named namespace.
>
> | -     function context, aliases are <emphasis>local</emphasis>).
> | +     function context, aliases are <emphasis>local</emphasis>). Default
> namespace has
> | +     empty name (empty string) and should be only one.
>
> This works as the description, on the other hand the same
> namespace prefix can be defined twice or more in the array and
> the last one is in effect. I don't see a reason for
> differenciating the default namespace case.
>

It looks like libxml2 bug. There is no sense to use more than one default
namespace, and although it is inconsistent with other namespaces, I am
thinking so it is correct. Is better to raise error early. In this case
Postgres expects so libxml2 ensure all namespace checks and it is tolerant.
Default namespace is implemented inside Postgres, and I don't see any
advantage of tolerant behave. More default namespaces is disallowed for
XMLTABLE - so some inconsistency there should be. In this case I prefer
raise error to signalize ambiguous or badly formatted input clearly.


>
>
> 4. Comments on the documentation part.
>
> # Even though I'm not sutable for commenting on wording...
>
> | +     Inside predicate literals <literal>and</literal>,
> <literal>or</literal>,
> | +     <literal>div</literal> and <literal>mod</literal> are used as
> keywords
> | +     (XPath operators) every time and default namespace are not applied
> there.
>
> *I*'d like to have a comma between the predicate and literals,
> and have a 'a' before prediate. Or 'Literals .. inside a
> predicate' might be better?
>

fixed

>
> 'are used as keywords' might be better being 'are identifed as
> keywords'?
>

fixed


> Default namespace is applied to tag names except the listed
> keywords even inside a predicate. So 'are not applied there'
> might be better being 'are not applied to them'?  Or 'are not
> applied in the case'?
>
> | +     If you would to use these literals like tag names, then the
> default namespace
> | +     should not be used, and these literals should be explicitly
> | +     labeled.
> | +    </para>
>
> Default namespace is not applied *only to* such keywords inside a
> predicate. Even if an Xpath expression contains such a tag name,
> default namespace still works for other tags. Does the following
> make sense?
>
> + Use named namespace to qualify such tag names appear in an
> + XPath predicate.
>
>
fixed

I hope so some native speaker finalize doc. It is out of my knowledges
.

>
>
> ===
> After the aboves are addressed (even or rejected), I think I
> don't have no additional comment.
>
>
> - This current patch applies safely (with small shifts) on the
>   current master.
>
> - The code looks fine for me.
>
> - This patch translates the given XPath expression by prefixing
>   unprefixed tag names with a special namespace prefix only in
>   the case where default namespace is defined, so the existing
>   behavior is not affected.
>
> - The syntax is existing but just not implemented so I don't
>   think no arguemnts needed here.
>
> - It undocumentedly inhibits the usage of the namespace prefix
>   "pgdefnamespace.pgsqlxml.internal" but I believe no one can
>   notice that.
>
> - The default-ns translator (xpath_parser.c) seems working
>   perfectly with some harmless exceptions.
>
>   (xpath specifications is here: https://www.w3.org/TR/1999/
> REC-xpath-19991116/)
>
>   Related unused features (and not documented?):
>     context variables ($n notations),
>     user-defined functions (or function names prefixed by a namespace
> prefix)


I did some fast check - and these features are not supported by libxml2 -
so it is question if it should be parsed by out xpath parser. So there are
not possible to check it, test it :(


>
>   Newly documented behavior:
>     the default namespace isn't applied to and/or/div/mod.
>
> - Dodumentation looks enough.
>
> - Regression test doesn't cover the XPath syntax but it's not
>   viable.  I am fine with the basic test cases added by the
>   current patch.
>
> regards,
>

I am sending updated version.

Very much thanks for very precious review

Pavel


> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
>
>
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 487c7ff750..3c410c5ac2 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10497,7 +10497,8 @@ SELECT xml_is_well_formed_document('<pg:foo xmlns:pg="http://postgresql.org/stuf
      second the namespace URI. It is not required that aliases provided in
      this array be the same as those being used in the XML document itself (in
      other words, both in the XML document and in the <function>xpath</function>
-     function context, aliases are <emphasis>local</emphasis>).
+     function context, aliases are <emphasis>local</emphasis>). Default namespace has
+     empty name (empty string) and should be only one.
     </para>
 
     <para>
@@ -10513,11 +10514,18 @@ SELECT xpath('/my:a/text()', '<my:a xmlns:my="http://example.com";>test</my:a>',
 ]]></screen>
     </para>
 
+    <para>
+     Inside a predicate, literals <literal>and</literal>, <literal>or</literal>,
+     <literal>div</literal> and <literal>mod</literal> are identified as keywords
+     (XPath operators) every time and default namespace are not applied in the case.
+     Use named namespace to qualify such tag names appear in an XPath predicate.
+    </para>
+
     <para>
      To deal with default (anonymous) namespaces, do something like this:
 <screen><![CDATA[
-SELECT xpath('//mydefns:b/text()', '<a xmlns="http://example.com";><b>test</b></a>',
-             ARRAY[ARRAY['mydefns', 'http://example.com']]);
+SELECT xpath('//b/text()', '<a xmlns="http://example.com";><b>test</b></a>',
+             ARRAY[ARRAY['', 'http://example.com']]);
 
  xpath
 --------
@@ -10591,8 +10599,7 @@ SELECT xpath_exists('/my:a/text()', '<my:a xmlns:my="http://example.com";>test</m
     <para>
      The optional <literal>XMLNAMESPACES</literal> clause is a comma-separated
      list of namespaces.  It specifies the XML namespaces used in
-     the document and their aliases. A default namespace specification
-     is not currently supported.
+     the document and their aliases.
     </para>
 
     <para>
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 1fb018416e..b60a3cfe0d 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -29,7 +29,7 @@ OBJS = acl.o amutils.o arrayfuncs.o array_expanded.o array_selfuncs.o \
 	tsquery_op.o tsquery_rewrite.o tsquery_util.o tsrank.o \
 	tsvector.o tsvector_op.o tsvector_parser.o \
 	txid.o uuid.o varbit.o varchar.o varlena.o version.o \
-	windowfuncs.o xid.o xml.o
+	windowfuncs.o xid.o xml.o xpath_parser.o
 
 like.o: like.c like_match.c
 
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 7cdb87ef85..7bcffdc442 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -90,7 +90,7 @@
 #include "utils/rel.h"
 #include "utils/syscache.h"
 #include "utils/xml.h"
-
+#include "utils/xpath_parser.h"
 
 /* GUC variables */
 int			xmlbinary;
@@ -187,6 +187,7 @@ typedef struct XmlTableBuilderData
 	xmlXPathCompExprPtr xpathcomp;
 	xmlXPathObjectPtr xpathobj;
 	xmlXPathCompExprPtr *xpathscomp;
+	bool		with_default_ns;
 } XmlTableBuilderData;
 #endif
 
@@ -227,6 +228,7 @@ const TableFuncRoutine XmlTableRoutine =
 #define NAMESPACE_XSI "http://www.w3.org/2001/XMLSchema-instance";
 #define NAMESPACE_SQLXML "http://standards.iso.org/iso/9075/2003/sqlxml";
 
+#define DEFAULT_NAMESPACE_NAME		"pgdefnamespace.pgsqlxml.internal"
 
 #ifdef USE_LIBXML
 
@@ -3850,6 +3852,7 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 	int			ndim;
 	Datum	   *ns_names_uris;
 	bool	   *ns_names_uris_nulls;
+	bool		with_default_ns = false;
 	int			ns_count;
 
 	/*
@@ -3860,6 +3863,8 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 	 * first element defining the name, the second one the URI.  Example:
 	 * ARRAY[ARRAY['myns', 'http://example.com'], ARRAY['myns2',
 	 * 'http://example2.com']].
+	 * When the name is empty string, then URI is used as default namespace.
+	 * Example: ARRAY[ARRAY['', 'http://x.y]]
 	 */
 	ndim = namespaces ? ARR_NDIM(namespaces) : 0;
 	if (ndim != 0)
@@ -3899,7 +3904,6 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 				 errmsg("empty XPath expression")));
 
 	string = pg_xmlCharStrndup(datastr, len);
-	xpath_expr = pg_xmlCharStrndup(VARDATA_ANY(xpath_expr_text), xpath_len);
 
 	/*
 	 * In a UTF8 database, skip any xml declaration, which might assert
@@ -3953,6 +3957,26 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 							(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
 							 errmsg("neither namespace name nor URI may be null")));
 				ns_name = TextDatumGetCString(ns_names_uris[i * 2]);
+
+				/* Don't allow same namespace as out internal default namespace name */
+				if (strcmp(ns_name, DEFAULT_NAMESPACE_NAME) == 0)
+					ereport(ERROR,
+								(errcode(ERRCODE_RESERVED_NAME),
+								 errmsg("cannot to use \"%s\" as namespace name",
+										  DEFAULT_NAMESPACE_NAME),
+								 errdetail("\"%s\" is reserved for internal purpose",
+										  DEFAULT_NAMESPACE_NAME)));
+				if (*ns_name == '\0')
+				{
+					if (with_default_ns)
+						ereport(ERROR,
+								(errcode(ERRCODE_SYNTAX_ERROR),
+								 errmsg("only one default namespace is allowed")));
+
+					with_default_ns = true;
+					ns_name = DEFAULT_NAMESPACE_NAME;
+				}
+
 				ns_uri = TextDatumGetCString(ns_names_uris[i * 2 + 1]);
 				if (xmlXPathRegisterNs(xpathctx,
 									   (xmlChar *) ns_name,
@@ -3963,6 +3987,16 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 			}
 		}
 
+		if (with_default_ns)
+		{
+			StringInfoData		str;
+
+			transformXPath(&str, text_to_cstring(xpath_expr_text), DEFAULT_NAMESPACE_NAME);
+			xpath_expr = pg_xmlCharStrndup(str.data, str.len);
+		}
+		else
+			xpath_expr = pg_xmlCharStrndup(VARDATA_ANY(xpath_expr_text), xpath_len);
+
 		xpathcomp = xmlXPathCompile(xpath_expr);
 		if (xpathcomp == NULL || xmlerrcxt->err_occurred)
 			xml_ereport(xmlerrcxt, ERROR, ERRCODE_INTERNAL_ERROR,
@@ -4207,6 +4241,7 @@ XmlTableInitOpaque(TableFuncScanState *state, int natts)
 	xtCxt->magic = XMLTABLE_CONTEXT_MAGIC;
 	xtCxt->natts = natts;
 	xtCxt->xpathscomp = palloc0(sizeof(xmlXPathCompExprPtr) * natts);
+	xtCxt->with_default_ns = false;
 
 	xmlerrcxt = pg_xml_init(PG_XML_STRICTNESS_ALL);
 
@@ -4299,6 +4334,7 @@ XmlTableSetDocument(TableFuncScanState *state, Datum value)
 #endif							/* not USE_LIBXML */
 }
 
+
 /*
  * XmlTableSetNamespace
  *		Add a namespace declaration
@@ -4309,12 +4345,25 @@ XmlTableSetNamespace(TableFuncScanState *state, const char *name, const char *ur
 #ifdef USE_LIBXML
 	XmlTableBuilderData *xtCxt;
 
-	if (name == NULL)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("DEFAULT namespace is not supported")));
 	xtCxt = GetXmlTableBuilderPrivateData(state, "XmlTableSetNamespace");
 
+	if (name != NULL)
+	{
+		/* Don't allow same namespace as out internal default namespace name */
+		if (strcmp(name, DEFAULT_NAMESPACE_NAME) == 0)
+			ereport(ERROR,
+						(errcode(ERRCODE_RESERVED_NAME),
+						 errmsg("cannot to use \"%s\" as namespace name",
+								  DEFAULT_NAMESPACE_NAME),
+						 errdetail("\"%s\" is reserved for internal purpose",
+								  DEFAULT_NAMESPACE_NAME)));
+	}
+	else
+	{
+		xtCxt->with_default_ns = true;
+		name = DEFAULT_NAMESPACE_NAME;
+	}
+
 	if (xmlXPathRegisterNs(xtCxt->xpathcxt,
 						   pg_xmlCharStrndup(name, strlen(name)),
 						   pg_xmlCharStrndup(uri, strlen(uri))))
@@ -4343,6 +4392,14 @@ XmlTableSetRowFilter(TableFuncScanState *state, const char *path)
 				(errcode(ERRCODE_DATA_EXCEPTION),
 				 errmsg("row path filter must not be empty string")));
 
+	if (xtCxt->with_default_ns)
+	{
+		StringInfoData		str;
+
+		transformXPath(&str, path, DEFAULT_NAMESPACE_NAME);
+		path = str.data;
+	}
+
 	xstr = pg_xmlCharStrndup(path, strlen(path));
 
 	xtCxt->xpathcomp = xmlXPathCompile(xstr);
@@ -4374,6 +4431,14 @@ XmlTableSetColumnFilter(TableFuncScanState *state, const char *path, int colnum)
 				(errcode(ERRCODE_DATA_EXCEPTION),
 				 errmsg("column path filter must not be empty string")));
 
+	if (xtCxt->with_default_ns)
+	{
+		StringInfoData		str;
+
+		transformXPath(&str, path, DEFAULT_NAMESPACE_NAME);
+		path = str.data;
+	}
+
 	xstr = pg_xmlCharStrndup(path, strlen(path));
 
 	xtCxt->xpathscomp[colnum] = xmlXPathCompile(xstr);
diff --git a/src/backend/utils/adt/xpath_parser.c b/src/backend/utils/adt/xpath_parser.c
new file mode 100644
index 0000000000..dff9fa60a8
--- /dev/null
+++ b/src/backend/utils/adt/xpath_parser.c
@@ -0,0 +1,369 @@
+/*-------------------------------------------------------------------------
+ *
+ * xpath_parser.c
+ *	  XML XPath parser.
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/backend/utils/adt/xpath_parser.c
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "utils/xpath_parser.h"
+
+/*
+ * All PostgreSQL XML related functionality is based on libxml2 library, and
+ * XPath support is not an exception.  However, libxml2 doesn't support
+ * default namespace for XPath expressions. Because there are not any API
+ * how to transform or access to parsed XPath expression we have to parse
+ * XPath here.
+ *
+ * Those functionalities are implemented with a simple XPath parser/
+ * preprocessor.  This XPath parser transforms a XPath expression to another
+ * XPath expression that can be used by libxml2 XPath evaluation. It doesn't
+ * replace libxml2 XPath parser or libxml2 XPath expression evaluation.
+ */
+
+#ifdef USE_LIBXML
+
+/*
+ * We need to work with XPath expression tokens.  When expression starting with
+ * nodename, then we can use prefix.  When default namespace is defined, then we
+ * should to enhance any nodename and attribute without namespace by default
+ * namespace.
+ */
+
+typedef enum
+{
+	XPATH_TOKEN_NONE,
+	XPATH_TOKEN_NAME,
+	XPATH_TOKEN_STRING,
+	XPATH_TOKEN_NUMBER,
+	XPATH_TOKEN_COLON,
+	XPATH_TOKEN_DCOLON,
+	XPATH_TOKEN_OTHER
+}	XPathTokenType;
+
+typedef struct XPathTokenInfo
+{
+	XPathTokenType ttype;
+	const char	   *start;
+	int			length;
+}	XPathTokenInfo;
+
+typedef struct ParserData
+{
+	const char	   *str;
+	const char	   *cur;
+	XPathTokenInfo buffer;
+	bool		buffer_is_empty;
+}	XPathParserData;
+
+/* Any high-bit-set character is OK (might be part of a multibyte char) */
+#define IS_NODENAME_FIRSTCHAR(c)	 ((c) == '_' || \
+								 ((c) >= 'A' && (c) <= 'Z') || \
+								 ((c) >= 'a' && (c) <= 'z') || \
+								 (IS_HIGHBIT_SET(c)))
+
+#define IS_NODENAME_CHAR(c)		(IS_NODENAME_FIRSTCHAR(c) || (c) == '-' || (c) == '.' || \
+								 ((c) >= '0' && (c) <= '9'))
+
+#define TOKEN_IS_EMPTY(t)		((t).ttype == XPATH_TOKEN_NONE)
+
+/*
+ * Returns next char after last char of token - XPath lexer
+ */
+static const char *
+getXPathToken(const char *str, XPathTokenInfo * ti)
+{
+	/* skip initial spaces */
+	while (*str == ' ')
+		str++;
+
+	if (*str != '\0')
+	{
+		char		c = *str;
+
+		ti->start = str++;
+
+		if (c >= '0' && c <= '9')
+		{
+			while (*str >= '0' && *str <= '9')
+				str++;
+			if (*str == '.')
+			{
+				str++;
+				while (*str >= '0' && *str <= '9')
+					str++;
+			}
+			ti->ttype = XPATH_TOKEN_NUMBER;
+		}
+		else if (IS_NODENAME_FIRSTCHAR(c))
+		{
+			while (IS_NODENAME_CHAR(*str))
+				str++;
+
+			ti->ttype = XPATH_TOKEN_NAME;
+		}
+		else if (c == '"')
+		{
+			while (*str != '\0')
+				if (*str++ == '"')
+					break;
+
+			ti->ttype = XPATH_TOKEN_STRING;
+		}
+		else if (c == '\'')
+		{
+			while (*str != '\0')
+				if (*str++ == '\'')
+					break;
+
+			ti->ttype = XPATH_TOKEN_STRING;
+		}
+		else if (c == ':')
+		{
+			/* look ahead to detect a double-colon */
+			if (*str == ':')
+			{
+				ti->ttype = XPATH_TOKEN_DCOLON;
+				str++;
+			}
+			else
+				ti->ttype = XPATH_TOKEN_COLON;
+		}
+		else
+			ti->ttype = XPATH_TOKEN_OTHER;
+
+		ti->length = str - ti->start;
+	}
+	else
+	{
+		ti->start = NULL;
+		ti->length = 0;
+
+		ti->ttype = XPATH_TOKEN_NONE;
+	}
+
+	return str;
+}
+
+/*
+ * reset XPath parser stack
+ */
+static void
+initXPathParser(XPathParserData * parser, const char *str)
+{
+	parser->str = str;
+	parser->cur = str;
+	parser->buffer_is_empty = true;
+}
+
+/*
+ * Returns token from stack or read token
+ */
+static void
+nextXPathToken(XPathParserData * parser, XPathTokenInfo * ti)
+{
+	if (!parser->buffer_is_empty)
+	{
+		memcpy(ti, &parser->buffer, sizeof(XPathTokenInfo));
+		parser->buffer_is_empty = true;
+	}
+	else
+		parser->cur = getXPathToken(parser->cur, ti);
+}
+
+/*
+ * Push token to stack
+ */
+static void
+pushXPathToken(XPathParserData * parser, XPathTokenInfo * ti)
+{
+	if (!parser->buffer_is_empty)
+		elog(ERROR, "internal error");
+
+	memcpy(&parser->buffer, ti, sizeof(XPathTokenInfo));
+	parser->buffer_is_empty = false;
+	ti->ttype = XPATH_TOKEN_NONE;
+}
+
+/*
+ * Write token to output string
+ */
+static void
+writeXPathToken(StringInfo str, XPathTokenInfo * ti)
+{
+	Assert(ti->ttype != XPATH_TOKEN_NONE);
+
+	if (ti->ttype != XPATH_TOKEN_OTHER)
+		appendBinaryStringInfo(str, ti->start, ti->length);
+	else
+		appendStringInfoChar(str, *ti->start);
+
+	ti->ttype = XPATH_TOKEN_NONE;
+}
+
+/*
+ * This is main part of XPath transformation. It can be called recursivly,
+ * when XPath expression contains predicates.
+ */
+static void
+_transformXPath(StringInfo str, XPathParserData * parser,
+				bool inside_predicate,
+				char *def_namespace_name)
+{
+	XPathTokenInfo t1,
+				t2;
+	bool		tagname_needs_defnsp;
+	bool		token_is_tagattrib = false;
+
+	nextXPathToken(parser, &t1);
+
+	while (t1.ttype != XPATH_TOKEN_NONE)
+	{
+		switch (t1.ttype)
+		{
+			case XPATH_TOKEN_NUMBER:
+			case XPATH_TOKEN_STRING:
+			case XPATH_TOKEN_COLON:
+			case XPATH_TOKEN_DCOLON:
+				/* write without any changes */
+				writeXPathToken(str, &t1);
+				/* process fresh token */
+				nextXPathToken(parser, &t1);
+				break;
+
+			case XPATH_TOKEN_NAME:
+				{
+					/*
+					 * Inside predicate ignore keywords (literal operators)
+					 * "and" "or" "div" and "mod".
+					 */
+					if (inside_predicate)
+					{
+						if ((strncmp(t1.start, "and", 3) == 0 && t1.length == 3) ||
+						 (strncmp(t1.start, "or", 2) == 0 && t1.length == 2) ||
+						 (strncmp(t1.start, "div", 3) == 0 && t1.length == 3) ||
+						 (strncmp(t1.start, "mod", 3) == 0 && t1.length == 3))
+						{
+							token_is_tagattrib = false;
+
+							/* keyword */
+							writeXPathToken(str, &t1);
+							/* process fresh token */
+							nextXPathToken(parser, &t1);
+							break;
+						}
+					}
+
+					tagname_needs_defnsp = true;
+
+					nextXPathToken(parser, &t2);
+					if (t2.ttype == XPATH_TOKEN_COLON)
+					{
+						/* t1 is a quilified node name. no need to add default one. */
+						tagname_needs_defnsp = false;
+
+						/* namespace name */
+						writeXPathToken(str, &t1);
+						/* colon */
+						writeXPathToken(str, &t2);
+						/* get node name */
+						nextXPathToken(parser, &t1);
+					}
+					else if (t2.ttype == XPATH_TOKEN_DCOLON)
+					{
+						/* t1 is an axis name. write out as it is */
+						if (strncmp(t1.start, "attribute", 9) == 0 && t1.length == 9)
+							token_is_tagattrib = true;
+
+						/* axis name */
+						writeXPathToken(str, &t1);
+						/* double colon */
+						writeXPathToken(str, &t2);
+
+						/*
+						 * The next token may be qualified tag name, process
+						 * it as a fresh token.
+						 */
+						nextXPathToken(parser, &t1);
+						break;
+					}
+					else if (t2.ttype == XPATH_TOKEN_OTHER)
+					{
+						/* function name doesn't require namespace */
+						if (*t2.start == '(')
+							tagname_needs_defnsp = false;
+						else
+							pushXPathToken(parser, &t2);
+					}
+
+					if (tagname_needs_defnsp && !token_is_tagattrib)
+						appendStringInfo(str, "%s:", def_namespace_name);
+
+					token_is_tagattrib = false;
+
+					/* write maybe-tagname if not consumed yet */
+					if (!TOKEN_IS_EMPTY(t1))
+						writeXPathToken(str, &t1);
+
+					/* output t2 if not consumed yet */
+					if (!TOKEN_IS_EMPTY(t2))
+						writeXPathToken(str, &t2);
+
+					nextXPathToken(parser, &t1);
+				}
+				break;
+
+			case XPATH_TOKEN_OTHER:
+				{
+					char		c = *t1.start;
+
+					writeXPathToken(str, &t1);
+
+					if (c == '[')
+						_transformXPath(str, parser, true, def_namespace_name);
+					else
+					{
+						if (c == ']' && inside_predicate)
+						{
+							return;
+						}
+						else if (c == '@')
+						{
+							nextXPathToken(parser, &t1);
+							if (t1.ttype == XPATH_TOKEN_NAME)
+								token_is_tagattrib = true;
+
+							pushXPathToken(parser, &t1);
+						}
+					}
+					nextXPathToken(parser, &t1);
+				}
+				break;
+
+			case XPATH_TOKEN_NONE:
+				elog(ERROR, "should not be here");
+		}
+	}
+}
+
+void
+transformXPath(StringInfo str, const char *xpath,
+			   char *def_namespace_name)
+{
+	XPathParserData parser;
+
+	Assert(def_namespace_name != NULL);
+
+	initStringInfo(str);
+	initXPathParser(&parser, xpath);
+	_transformXPath(str, &parser, false, def_namespace_name);
+
+	elog(DEBUG1, "apply default namespace \"%s\"", str->data);
+}
+
+#endif
diff --git a/src/include/utils/xpath_parser.h b/src/include/utils/xpath_parser.h
new file mode 100644
index 0000000000..57d37df312
--- /dev/null
+++ b/src/include/utils/xpath_parser.h
@@ -0,0 +1,23 @@
+/*-------------------------------------------------------------------------
+ *
+ * xpath_parser.h
+ *	  Declarations for XML XPath transformation.
+ *
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/xml.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#ifndef XPATH_PARSER_H
+#define XPATH_PARSER_H
+
+#include "postgres.h"
+#include "lib/stringinfo.h"
+
+void transformXPath(StringInfo str, const char *xpath, char *def_namespace_name);
+
+#endif   /* XPATH_PARSER_H */
diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out
index 7fa1309108..fe8374d14b 100644
--- a/src/test/regress/expected/xml.out
+++ b/src/test/regress/expected/xml.out
@@ -1120,7 +1120,11 @@ SELECT * FROM XMLTABLE(XMLNAMESPACES(DEFAULT 'http://x.y'),
                       '/rows/row'
                       PASSING '<rows xmlns="http://x.y";><row><a>10</a></row></rows>'
                       COLUMNS a int PATH 'a');
-ERROR:  DEFAULT namespace is not supported
+ a  
+----
+ 10
+(1 row)
+
 -- used in prepare statements
 PREPARE pp AS
 SELECT  xmltable.*
@@ -1487,3 +1491,69 @@ SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c
  14
 (4 rows)
 
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></rows>');
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'child::a[1][attribute::hoge="haha"]') as x;
+ data 
+------
+   50
+(1 row)
+
+-- both string separators are supported
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[@hoge="haha"]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[@hoge=''haha'']') AS x;
+ data 
+------
+   50
+(1 row)
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ERROR:  cannot to use "pgdefnamespace.pgsqlxml.internal" as namespace name
+DETAIL:  "pgdefnamespace.pgsqlxml.internal" is reserved for internal purpose
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+                      xpath                       
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\"; hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+                      xpath                       
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\"; hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ xpath_exists 
+--------------
+ t
+(1 row)
+
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ xpath_exists 
+--------------
+ t
+(1 row)
+
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
+ERROR:  only one default namespace is allowed
diff --git a/src/test/regress/expected/xml_1.out b/src/test/regress/expected/xml_1.out
index 970ab26fce..3aa32d51d5 100644
--- a/src/test/regress/expected/xml_1.out
+++ b/src/test/regress/expected/xml_1.out
@@ -1337,3 +1337,75 @@ SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c
 ---
 (0 rows)
 
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></rows>');
+ERROR:  unsupported XML feature
+LINE 1: INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y";><row><a ...
+                                  ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+(0 rows)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+ data 
+------
+(0 rows)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'child::a[1][attribute::hoge="haha"]') as x;
+ data 
+------
+(0 rows)
+
+-- both string separators are supported
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[@hoge="haha"]') AS x;
+ data 
+------
+(0 rows)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[@hoge=''haha'']') AS x;
+ data 
+------
+(0 rows)
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+(0 rows)
+
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="ht...
+                                                    ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x...
+                                              ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: ...ELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xml...
+                                                             ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="h...
+                                                     ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="h...
+                                                     ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
diff --git a/src/test/regress/expected/xml_2.out b/src/test/regress/expected/xml_2.out
index 112ebe47cd..58b46bcc24 100644
--- a/src/test/regress/expected/xml_2.out
+++ b/src/test/regress/expected/xml_2.out
@@ -1100,7 +1100,11 @@ SELECT * FROM XMLTABLE(XMLNAMESPACES(DEFAULT 'http://x.y'),
                       '/rows/row'
                       PASSING '<rows xmlns="http://x.y";><row><a>10</a></row></rows>'
                       COLUMNS a int PATH 'a');
-ERROR:  DEFAULT namespace is not supported
+ a  
+----
+ 10
+(1 row)
+
 -- used in prepare statements
 PREPARE pp AS
 SELECT  xmltable.*
@@ -1467,3 +1471,69 @@ SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c
  14
 (4 rows)
 
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></rows>');
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'child::a[1][attribute::hoge="haha"]') as x;
+ data 
+------
+   50
+(1 row)
+
+-- both string separators are supported
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[@hoge="haha"]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[@hoge=''haha'']') AS x;
+ data 
+------
+   50
+(1 row)
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ERROR:  cannot to use "pgdefnamespace.pgsqlxml.internal" as namespace name
+DETAIL:  "pgdefnamespace.pgsqlxml.internal" is reserved for internal purpose
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+                      xpath                       
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\"; hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+                      xpath                       
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\"; hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ xpath_exists 
+--------------
+ t
+(1 row)
+
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ xpath_exists 
+--------------
+ t
+(1 row)
+
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
+ERROR:  only one default namespace is allowed
diff --git a/src/test/regress/sql/xml.sql b/src/test/regress/sql/xml.sql
index cb96e18005..55b49d6dba 100644
--- a/src/test/regress/sql/xml.sql
+++ b/src/test/regress/sql/xml.sql
@@ -594,3 +594,27 @@ INSERT INTO xmltest2 VALUES('<d><r><dc>2</dc></r></d>', 'D');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable('/d/r' PASSING x COLUMNS a int PATH '' || lower(_path) || 'c');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c') PASSING x COLUMNS a int PATH '.');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c') PASSING x COLUMNS a int PATH 'x' DEFAULT ascii(_path) - 54);
+
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></rows>');
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'child::a[1][attribute::hoge="haha"]') as x;
+
+-- both string separators are supported
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[@hoge="haha"]') AS x;
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[@hoge=''haha'']') AS x;
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y";><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);

Reply via email to