Re: Review: Re: [PATCH] Re: [HACKERS] Adding xpath_exists function

2010-08-08 Thread Tom Lane
Mike Fowler m...@mlfowler.com writes:
 On 06/08/10 20:55, Peter Eisentraut wrote:
 On fre, 2010-08-06 at 09:04 +0100, Mike Fowler wrote:
 If the patch is to be committed, does it make sense for me to refine
 it such that it uses the new xpath internal function you extracted in
 the xmlexists patch?
 
 Yes, you can probably shrink this patch down to about 20 lines.

 Updated the patch so that it will apply to head and re-worked the 
 function to use the new xpath internal function.

Applied with minor corrections (improved docs, fixed regression tests,
adjusted OIDs for CVS HEAD).

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Review: Re: [PATCH] Re: [HACKERS] Adding xpath_exists function

2010-08-07 Thread Mike Fowler

On 06/08/10 20:55, Peter Eisentraut wrote:

On fre, 2010-08-06 at 09:04 +0100, Mike Fowler wrote:

If the patch is to be committed, does it make sense for me to refine
it such that it uses the new xpath internal function you extracted in
the xmlexists patch?


Yes, you can probably shrink this patch down to about 20 lines.



Updated the patch so that it will apply to head and re-worked the 
function to use the new xpath internal function.


Regards,

--
Mike Fowler
Registered Linux user: 379787
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 8693,8698  SELECT xpath('//mydefns:b/text()', 'a xmlns=http://example.com;btest/b/a
--- 8693,8731 
  (1 row)
  ]]/screen
 /para
+ 
+sect3
+ titlexpath_exists/title
+ 
+ indexterm
+  primaryxpath_exists/primary
+ /indexterm
+ 
+ synopsis
+  functionxpath_exists/function(replaceablexpath/replaceable, replaceablexml/replaceableoptional, replaceablensarray/replaceable/optional)
+ /synopsis
+ 
+ para
+  The function functionxpath_exists/function is a specialised form
+  of the functionxpath/function function. Though the functions are
+  syntactically the same the xpath expressions are evaluated in differing
+  contexts. Instead of returning the XML values that satisfy the xpath, this
+  function returns a boolean indicating whether the query was satisfied or not.
+ /para
+ 
+ para
+  Example:
+  screen![CDATA[
+ SELECT xpath_exists('/my:a/text()', 'my:a xmlns:my=http://example.com;test/my:a', 
+  ARRAY[ARRAY['my', 'http://example.com']]);
+ 
+  xpath_exists  
+ 
+  t
+ (1 row)
+ ]]/screen
+  /para
+ /sect3
/sect2
  
sect2 id=functions-xml-mapping
*** a/src/backend/utils/adt/xml.c
--- b/src/backend/utils/adt/xml.c
***
*** 3541,3543  Datum xmlexists(PG_FUNCTION_ARGS)
--- 3541,3567 
  	return 0;
  #endif
  }
+ 
+ /*
+  * Determines if the node specified by the supplied XPath exists
+  * in a given XML document, returning a boolean. Differs from
+  * xmlexists as it supports namespaces and is not defined in SQL/XML.
+  */
+ Datum
+ xpath_exists(PG_FUNCTION_ARGS)
+ {
+ #ifdef USE_LIBXML
+ 	text	   *xpath_expr_text = PG_GETARG_TEXT_P(0);
+ 	xmltype*data = PG_GETARG_XML_P(1);
+ 	ArrayType  *namespaces = PG_GETARG_ARRAYTYPE_P(2);
+ 	int			res_nitems;
+ 
+ 	xpath_internal(xpath_expr_text, data, namespaces,
+    res_nitems, NULL);
+ 
+ 	PG_RETURN_BOOL(res_nitems  0);
+ #else
+ 	NO_XML_SUPPORT();
+ 	return 0;
+ #endif
+ }
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
***
*** 4390,4395  DESCR(evaluate XPath expression);
--- 4390,4400 
  DATA(insert OID = 2614 (  xmlexists	 PGNSP PGUID 12 1 0 0 f f f t f i 2 0 16 25 142 _null_ _null_ _null_ _null_ xmlexists _null_ _null_ _null_ ));
  DESCR(test XML value against XPath expression);
  
+ DATA(insert OID = 3037 (  xpath_exists	 PGNSP PGUID 12 1 0 0 f f f t f i 3 0 16 25 142 1009 _null_ _null_ _null_ _null_ xpath_exists _null_ _null_ _null_ ));
+ DESCR(evaluate XPath expression in a boolean context, with namespaces support);
+ DATA(insert OID = 3038 (  xpath_exists	 PGNSP PGUID 14 1 0 0 f f f t f i 2 0 16 25 142 _null_ _null_ _null_ _null_ select pg_catalog.xpath_exists($1, $2, ''{}''::pg_catalog.text[]) _null_ _null_ _null_ ));
+ DESCR(evaluate XPath expression in a boolean context);
+ 
  /* uuid */
  DATA(insert OID = 2952 (  uuid_in		   PGNSP PGUID 12 1 0 0 f f f t f i 1 0 2950 2275 _null_ _null_ _null_ _null_ uuid_in _null_ _null_ _null_ ));
  DESCR(I/O);
*** a/src/include/utils/xml.h
--- b/src/include/utils/xml.h
***
*** 37,42  extern Datum texttoxml(PG_FUNCTION_ARGS);
--- 37,43 
  extern Datum xmltotext(PG_FUNCTION_ARGS);
  extern Datum xmlvalidate(PG_FUNCTION_ARGS);
  extern Datum xpath(PG_FUNCTION_ARGS);
+ extern Datum xpath_exists(PG_FUNCTION_ARGS);
  extern Datum xmlexists(PG_FUNCTION_ARGS);
  
  extern Datum table_to_xml(PG_FUNCTION_ARGS);
*** a/src/test/regress/expected/xml.out
--- b/src/test/regress/expected/xml.out
***
*** 502,507  SELECT xpath('//b', 'aone btwo/b three betc/b/a');
--- 502,560 
   {btwo/b,betc/b}
  (1 row)
  
+ -- Test xpath_exists evaluation
+ SELECT xpath_exists('//town[text() = ''Toronto'']','townstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns'::xml);
+  xpath_exists 
+ --
+  f
+ (1 row)
+ 
+ SELECT xpath_exists('//town[text() = ''Cwmbran'']','townstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns'::xml);
+  xpath_exists 
+ --
+  t
+ (1 row)
+ 
+ INSERT INTO xmltest VALUES (4, 'menubeersnameBudvar/namecostfree/costnameCarling/namecostlots/cost/beers/menu'::xml);
+ INSERT INTO xmltest VALUES (5, 'menubeersnameMolson/namecostfree/costnameCarling/namecostlots/cost/beers/menu'::xml);
+ INSERT INTO xmltest VALUES (6, 'myns:menu 

Re: Review: Re: [PATCH] Re: [HACKERS] Adding xpath_exists function

2010-08-06 Thread Mike Fowler

On 06/08/10 05:38, Peter Eisentraut wrote:

On tis, 2010-07-27 at 16:33 -0700, David Fetter wrote:
   

* Do we already have it?

 Not really.  There are kludges to accomplish these things, but
 they're available mostly in the sense that a general-purpose
 language allows you to write code to do anything a Turing machine
 can do.
 

I think this has been obsoleted by the xmlexists patch


In many ways yes. The only surviving difference is that xpath_exists has 
support for namespaces and xmlexists does not as the grammar expects 
namespaces to be handled in the xquery. So if people expect namespace 
support to be useful that having both functions is useful until I (or 
someone who works faster than me) get xquery going.


If the patch is to be committed, does it make sense for me to refine it 
such that it uses the new xpath internal function you extracted in the 
xmlexists patch?


Regards,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Review: Re: [PATCH] Re: [HACKERS] Adding xpath_exists function

2010-08-06 Thread Peter Eisentraut
On fre, 2010-08-06 at 09:04 +0100, Mike Fowler wrote:
 If the patch is to be committed, does it make sense for me to refine
 it such that it uses the new xpath internal function you extracted in
 the xmlexists patch?

Yes, you can probably shrink this patch down to about 20 lines.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [PATCH] Re: [HACKERS] Adding xpath_exists function

2010-08-05 Thread Alvaro Herrera
Excerpts from Mike Fowler's message of mar jun 29 06:37:28 -0400 2010:

 After seeing some other posts in the last couple of days, I realised I 
 hadn't documented the function in the SGML. I have now done so, and 
 added a couple of tests with XML literals. Please find the patch 
 attached. Now time to go correct the xmlexists patch too...

Hmm, is 0 a valid node number in a xmlNodeSet?  If it is, searching for
the zeroth node would make the code call PG_RETURN_BOOL(0), no?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Review: Re: [PATCH] Re: [HACKERS] Adding xpath_exists function

2010-08-05 Thread Peter Eisentraut
On tis, 2010-07-27 at 16:33 -0700, David Fetter wrote:
 * Do we already have it? 
 
 Not really.  There are kludges to accomplish these things, but
 they're available mostly in the sense that a general-purpose
 language allows you to write code to do anything a Turing machine
 can do.

I think this has been obsoleted by the xmlexists patch.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Review: Re: [PATCH] Re: [HACKERS] Adding xpath_exists function

2010-07-27 Thread David Fetter
== Submission review ==

* Is the patch in context diff format?

Yes.

* Does it apply cleanly to the current CVS HEAD?

Yes.

patch -p1  ../xpath_exists-3.patch 
patching file doc/src/sgml/func.sgml
Hunk #1 succeeded at 8642 (offset 16 lines).
patching file src/backend/utils/adt/xml.c
patching file src/include/catalog/pg_proc.h
Hunk #1 succeeded at 4391 (offset 6 lines).
patching file src/include/utils/xml.h
patching file src/test/regress/expected/xml.out
patching file src/test/regress/sql/xml.sql

* Does it include reasonable tests, necessary doc patches, etc?

Tests:

As this is new functionality, it doesn't really need to test
much for interactions with other parts of the system.

I'm not really an XML expert, so I'd like to punt as to
whether it tests enough functionality.

Minor quibble with the regression tests: should we be using
dollar quotes in things like this?  Doubled-up quote marks:

SELECT xpath_exists('//town[text() = 
''Cwmbran'']','townstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns'::xml);

Dollar quote:

SELECT xpath_exists($$//town[text() = 
'Cwmbran']$$,'townstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns'::xml);


Doc patches: Good up to cross-Atlantic differences in spelling
(speciali[sz]ed), e.g.

== Usability review ==  

Read what the patch is supposed to do, and consider:

* Does the patch actually implement that? 

Yes.

* Do we want that? 

Yes.

* Do we already have it? 

Not really.  There are kludges to accomplish these things, but
they're available mostly in the sense that a general-purpose
language allows you to write code to do anything a Turing machine
can do.

* Does it follow SQL spec, or the community-agreed behavior? 

Yes.

* Does it include pg_dump support (if applicable)?

Not applicable.

* Are there dangers? 

Not that I can see.

* Have all the bases been covered?

To the extent of my XML knowledge, yes.

== Feature test ==

Apply the patch, compile it and test:

* Does the feature work as advertised?

Yes.

* Are there corner cases the author has failed to consider?

Not that I've found.  See above re: XML and my vast ignorance of
same.

* Are there any assertion failures or crashes?

No.

== Performance review ==

* Does the patch slow down simple tests? 

No.

* If it claims to improve performance, does it?

No such claim made.  The kludges needed to reproduce the
functionality would certainly consume an enormous number of
developer hours, though.

* Does it slow down other things?

Not that I've found.  There might be some minuscule slowing down
of the code to the existence of more code paths, but we're a long,
long way from having that be something other than noise.

== Coding review ==

Read the changes to the code in detail and consider:

* Does it follow the project 
[http://developer.postgresql.org/pgdocs/postgres/source.html coding 
guidelines]? 

Yes.

* Are there portability issues? 

Not that I can see.

* Will it work on Windows/BSD etc? 

Should do.

* Are the comments sufficient and accurate?

Yes.

* Does it do what it says, correctly?

Yes, subject to, etc.

* Does it produce compiler warnings?

No.

* Can you make it crash?

No.

== Architecture review ==

Consider the changes to the code in the context of the project as a whole:

* Is everything done in a way that fits together coherently with other 
features/modules? 

Yes.

* Are there interdependencies that can cause problems?

Not that I've found.

Cheers,
David.
On Tue, Jun 29, 2010 at 11:37:28AM +0100, Mike Fowler wrote:
 Mike Fowler wrote:
 Bruce Momjian wrote:
 I have added this to the next commit-fest:
 
 https://commitfest.postgresql.org/action/commitfest_view?id=6
 Thanks Bruce. Attached is a revised patch which changes the code
 slightly such that it uses an older version of the libxml library.
 I've added comments to the code so that we remember why we didn't
 use the latest function.
 
 After seeing some other posts in the last couple of days, I realised
 I hadn't documented the function in the SGML. I have now done so,
 and added a couple of tests with XML literals. Please find the patch
 attached. Now time to go correct the xmlexists patch too...
 
 -- 
 Mike Fowler
 Registered Linux user: 379787
 

 *** a/doc/src/sgml/func.sgml
 --- b/doc/src/sgml/func.sgml
 ***
 *** 8626,8631  SELECT xpath('/my:a/text()', 'my:a 
 xmlns:my=http://example.com;test/my:a',
 --- 8626,8664 
   (1 row)
   ]]/screen
  /para
 +
 +sect3
 + titlexpath_exists/title
 + 
 + indexterm
 +  primaryxpath_exists/primary
 + /indexterm
 + 
 + synopsis
 +  functionxpath_exists/function(replaceablexpath/replaceable, 
 replaceablexml/replaceableoptional, 
 

Re: Review: Re: [PATCH] Re: [HACKERS] Adding xpath_exists function

2010-07-27 Thread Robert Haas
On Tue, Jul 27, 2010 at 7:33 PM, David Fetter da...@fetter.org wrote:
        Minor quibble with the regression tests: should we be using
        dollar quotes in things like this?  Doubled-up quote marks:

        SELECT xpath_exists('//town[text() = 
 ''Cwmbran'']','townstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns'::xml);

        Dollar quote:

        SELECT xpath_exists($$//town[text() = 
 'Cwmbran']$$,'townstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns'::xml);

Personally, I don't really see that as an improvement.  Dollar-quotes
are really nice for longer strings, or where you would otherwise have
quadrupled quotes (or more), but I don't see a big advantage to it
here.  Still, it's a question of opinion more than anything else.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [RRR] Review: Re: [PATCH] Re: [HACKERS] Adding xpath_exists function

2010-07-27 Thread Jeff Davis
On Tue, 2010-07-27 at 19:41 -0400, Robert Haas wrote:
 On Tue, Jul 27, 2010 at 7:33 PM, David Fetter da...@fetter.org wrote:
 Minor quibble with the regression tests: should we be using
 dollar quotes in things like this?  Doubled-up quote marks:
 
 SELECT xpath_exists('//town[text() = 
  ''Cwmbran'']','townstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns'::xml);
 
 Dollar quote:
 
 SELECT xpath_exists($$//town[text() = 
  'Cwmbran']$$,'townstownBidford-on-Avon/towntownCwmbran/towntownBristol/town/towns'::xml);
 
 Personally, I don't really see that as an improvement.  Dollar-quotes
 are really nice for longer strings, or where you would otherwise have
 quadrupled quotes (or more), but I don't see a big advantage to it
 here.  Still, it's a question of opinion more than anything else.

I like the idea of using dollar quotes, but I think they should be used
for both arguments (or neither). Using $$ for one and then shifting to
' for the second argument with no whitespace at all seems like the
least readable option.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [PATCH] Re: [HACKERS] Adding xpath_exists function

2010-06-29 Thread Mike Fowler

Mike Fowler wrote:

Bruce Momjian wrote:

I have added this to the next commit-fest:

https://commitfest.postgresql.org/action/commitfest_view?id=6   
Thanks Bruce. Attached is a revised patch which changes the code 
slightly such that it uses an older version of the libxml library. 
I've added comments to the code so that we remember why we didn't use 
the latest function.


After seeing some other posts in the last couple of days, I realised I 
hadn't documented the function in the SGML. I have now done so, and 
added a couple of tests with XML literals. Please find the patch 
attached. Now time to go correct the xmlexists patch too...


--
Mike Fowler
Registered Linux user: 379787

*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 8626,8631  SELECT xpath('/my:a/text()', 'my:a xmlns:my=http://example.com;test/my:a',
--- 8626,8664 
  (1 row)
  ]]/screen
 /para
+
+sect3
+ titlexpath_exists/title
+ 
+ indexterm
+  primaryxpath_exists/primary
+ /indexterm
+ 
+ synopsis
+  functionxpath_exists/function(replaceablexpath/replaceable, replaceablexml/replaceableoptional, replaceablensarray/replaceable/optional)
+ /synopsis
+ 
+ para
+  The function functionxpath_exists/function is a specialised form
+  of the functionxpath/function function. Though the functions are
+  syntactically the same the xpath expressions are evaluated in differing
+  contexts. Instead of returning the XML values that satisfy the xpath, this
+  function returns a boolean indicating whether the query was satisfied or not.
+ /para
+ 
+ para
+  Example:
+  screen![CDATA[
+ SELECT xpath_exists('/my:a/text()', 'my:a xmlns:my=http://example.com;test/my:a', 
+ ARRAY[ARRAY['my', 'http://example.com']]);
+ 
+  xpath_exists  
+ 
+  t
+ (1 row)
+ ]]/screen
+ /para
+/sect3
/sect2
  
sect2 id=functions-xml-mapping
*** a/src/backend/utils/adt/xml.c
--- b/src/backend/utils/adt/xml.c
***
*** 3495,3497  xpath(PG_FUNCTION_ARGS)
--- 3495,3681 
  	return 0;
  #endif
  }
+ 
+ /*
+  * Determines if the node specified by the supplied XPath exists
+  * in a given XML document, returning a boolean.
+  *
+  * It is up to the user to ensure that the XML passed is in fact
+  * an XML document - XPath doesn't work easily on fragments without
+  * a context node being known.
+  */
+ Datum
+ xpath_exists(PG_FUNCTION_ARGS)
+ {
+ #ifdef USE_LIBXML
+ 	text	   *xpath_expr_text = PG_GETARG_TEXT_P(0);
+ 	xmltype*data = PG_GETARG_XML_P(1);
+ 	ArrayType  *namespaces = PG_GETARG_ARRAYTYPE_P(2);
+ 	ArrayBuildState *astate = NULL;
+ 	xmlParserCtxtPtr ctxt = NULL;
+ 	xmlDocPtr	doc = NULL;
+ 	xmlXPathContextPtr xpathctx = NULL;
+ 	xmlXPathCompExprPtr xpathcomp = NULL;
+ 	xmlXPathObjectPtr xpathobj = NULL;
+ 	char	   *datastr;
+ 	int32		len;
+ 	int32		xpath_len;
+ 	xmlChar*string;
+ 	xmlChar*xpath_expr;
+ 	int			i;
+ 	int			res_nitems;
+ 	int			ndim;
+ 	Datum	   *ns_names_uris;
+ 	bool	   *ns_names_uris_nulls;
+ 	int			ns_count;
+ 	int			result;
+ 
+ 	/*
+ 	 * Namespace mappings are passed as text[].  If an empty array is passed
+ 	 * (ndim = 0, 0-dimensional), then there are no namespace mappings.
+ 	 * Else, a 2-dimensional array with length of the second axis being equal
+ 	 * to 2 should be passed, i.e., every subarray contains 2 elements, the
+ 	 * first element defining the name, the second one the URI.  Example:
+ 	 * ARRAY[ARRAY['myns', 'http://example.com'], ARRAY['myns2',
+ 	 * 'http://example2.com']].
+ 	 */
+ 	ndim = ARR_NDIM(namespaces);
+ 	if (ndim != 0)
+ 	{
+ 		int		   *dims;
+ 
+ 		dims = ARR_DIMS(namespaces);
+ 
+ 		if (ndim != 2 || dims[1] != 2)
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_DATA_EXCEPTION),
+ 	 errmsg(invalid array for XML namespace mapping),
+ 	 errdetail(The array must be two-dimensional with length of the second axis equal to 2.)));
+ 
+ 		Assert(ARR_ELEMTYPE(namespaces) == TEXTOID);
+ 
+ 		deconstruct_array(namespaces, TEXTOID, -1, false, 'i',
+ 		  ns_names_uris, ns_names_uris_nulls,
+ 		  ns_count);
+ 
+ 		Assert((ns_count % 2) == 0);	/* checked above */
+ 		ns_count /= 2;			/* count pairs only */
+ 	}
+ 	else
+ 	{
+ 		ns_names_uris = NULL;
+ 		ns_names_uris_nulls = NULL;
+ 		ns_count = 0;
+ 	}
+ 
+ 	datastr = VARDATA(data);
+ 	len = VARSIZE(data) - VARHDRSZ;
+ 	xpath_len = VARSIZE(xpath_expr_text) - VARHDRSZ;
+ 	if (xpath_len == 0)
+ 		ereport(ERROR,
+ (errcode(ERRCODE_DATA_EXCEPTION),
+  errmsg(empty XPath expression)));
+ 
+ 	string = (xmlChar *) palloc((len + 1) * sizeof(xmlChar));
+ 	memcpy(string, datastr, len);
+ 	string[len] = '\0';
+ 
+ 	xpath_expr = (xmlChar *) palloc((xpath_len + 1) * sizeof(xmlChar));
+ 	memcpy(xpath_expr, VARDATA(xpath_expr_text), xpath_len);
+ 	xpath_expr[xpath_len] = '\0';
+ 
+ 	pg_xml_init();
+ 	xmlInitParser();
+ 
+ 	PG_TRY();
+ 	{
+ 		/*
+ 		 * redundant XML parsing (two 

[PATCH] Re: [HACKERS] Adding xpath_exists function

2010-06-27 Thread Mike Fowler

Bruce Momjian wrote:

I have added this to the next commit-fest:

https://commitfest.postgresql.org/action/commitfest_view?id=6

  
Thanks Bruce. Attached is a revised patch which changes the code 
slightly such that it uses an older version of the libxml library. I've 
added comments to the code so that we remember why we didn't use the 
latest function.


Regards,

--
Mike Fowler
Registered Linux user: 379787



*** a/src/backend/utils/adt/xml.c
--- b/src/backend/utils/adt/xml.c
***
*** 3495,3497  xpath(PG_FUNCTION_ARGS)
--- 3495,3681 
  	return 0;
  #endif
  }
+ 
+ /*
+  * Determines if the node specified by the supplied XPath exists
+  * in a given XML document, returning a boolean.
+  *
+  * It is up to the user to ensure that the XML passed is in fact
+  * an XML document - XPath doesn't work easily on fragments without
+  * a context node being known.
+  */
+ Datum
+ xpath_exists(PG_FUNCTION_ARGS)
+ {
+ #ifdef USE_LIBXML
+ 	text	   *xpath_expr_text = PG_GETARG_TEXT_P(0);
+ 	xmltype*data = PG_GETARG_XML_P(1);
+ 	ArrayType  *namespaces = PG_GETARG_ARRAYTYPE_P(2);
+ 	ArrayBuildState *astate = NULL;
+ 	xmlParserCtxtPtr ctxt = NULL;
+ 	xmlDocPtr	doc = NULL;
+ 	xmlXPathContextPtr xpathctx = NULL;
+ 	xmlXPathCompExprPtr xpathcomp = NULL;
+ 	xmlXPathObjectPtr xpathobj = NULL;
+ 	char	   *datastr;
+ 	int32		len;
+ 	int32		xpath_len;
+ 	xmlChar*string;
+ 	xmlChar*xpath_expr;
+ 	int			i;
+ 	int			res_nitems;
+ 	int			ndim;
+ 	Datum	   *ns_names_uris;
+ 	bool	   *ns_names_uris_nulls;
+ 	int			ns_count;
+ 	int			result;
+ 
+ 	/*
+ 	 * Namespace mappings are passed as text[].  If an empty array is passed
+ 	 * (ndim = 0, 0-dimensional), then there are no namespace mappings.
+ 	 * Else, a 2-dimensional array with length of the second axis being equal
+ 	 * to 2 should be passed, i.e., every subarray contains 2 elements, the
+ 	 * first element defining the name, the second one the URI.  Example:
+ 	 * ARRAY[ARRAY['myns', 'http://example.com'], ARRAY['myns2',
+ 	 * 'http://example2.com']].
+ 	 */
+ 	ndim = ARR_NDIM(namespaces);
+ 	if (ndim != 0)
+ 	{
+ 		int		   *dims;
+ 
+ 		dims = ARR_DIMS(namespaces);
+ 
+ 		if (ndim != 2 || dims[1] != 2)
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_DATA_EXCEPTION),
+ 	 errmsg(invalid array for XML namespace mapping),
+ 	 errdetail(The array must be two-dimensional with length of the second axis equal to 2.)));
+ 
+ 		Assert(ARR_ELEMTYPE(namespaces) == TEXTOID);
+ 
+ 		deconstruct_array(namespaces, TEXTOID, -1, false, 'i',
+ 		  ns_names_uris, ns_names_uris_nulls,
+ 		  ns_count);
+ 
+ 		Assert((ns_count % 2) == 0);	/* checked above */
+ 		ns_count /= 2;			/* count pairs only */
+ 	}
+ 	else
+ 	{
+ 		ns_names_uris = NULL;
+ 		ns_names_uris_nulls = NULL;
+ 		ns_count = 0;
+ 	}
+ 
+ 	datastr = VARDATA(data);
+ 	len = VARSIZE(data) - VARHDRSZ;
+ 	xpath_len = VARSIZE(xpath_expr_text) - VARHDRSZ;
+ 	if (xpath_len == 0)
+ 		ereport(ERROR,
+ (errcode(ERRCODE_DATA_EXCEPTION),
+  errmsg(empty XPath expression)));
+ 
+ 	string = (xmlChar *) palloc((len + 1) * sizeof(xmlChar));
+ 	memcpy(string, datastr, len);
+ 	string[len] = '\0';
+ 
+ 	xpath_expr = (xmlChar *) palloc((xpath_len + 1) * sizeof(xmlChar));
+ 	memcpy(xpath_expr, VARDATA(xpath_expr_text), xpath_len);
+ 	xpath_expr[xpath_len] = '\0';
+ 
+ 	pg_xml_init();
+ 	xmlInitParser();
+ 
+ 	PG_TRY();
+ 	{
+ 		/*
+ 		 * redundant XML parsing (two parsings for the same value during one
+ 		 * command execution are possible)
+ 		 */
+ 		ctxt = xmlNewParserCtxt();
+ 		if (ctxt == NULL)
+ 			xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY,
+ 		could not allocate parser context);
+ 		doc = xmlCtxtReadMemory(ctxt, (char *) string, len, NULL, NULL, 0);
+ 		if (doc == NULL)
+ 			xml_ereport(ERROR, ERRCODE_INVALID_XML_DOCUMENT,
+ 		could not parse XML document);
+ 		xpathctx = xmlXPathNewContext(doc);
+ 		if (xpathctx == NULL)
+ 			xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY,
+ 		could not allocate XPath context);
+ 		xpathctx-node = xmlDocGetRootElement(doc);
+ 		if (xpathctx-node == NULL)
+ 			xml_ereport(ERROR, ERRCODE_INTERNAL_ERROR,
+ 		could not find root XML element);
+ 
+ 		/* register namespaces, if any */
+ 		if (ns_count  0)
+ 		{
+ 			for (i = 0; i  ns_count; i++)
+ 			{
+ char	   *ns_name;
+ char	   *ns_uri;
+ 
+ if (ns_names_uris_nulls[i * 2] ||
+ 	ns_names_uris_nulls[i * 2 + 1])
+ 	ereport(ERROR,
+ 			(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
+ 	  errmsg(neither namespace name nor URI may be null)));
+ ns_name = TextDatumGetCString(ns_names_uris[i * 2]);
+ ns_uri = TextDatumGetCString(ns_names_uris[i * 2 + 1]);
+ if (xmlXPathRegisterNs(xpathctx,
+ 	   (xmlChar *) ns_name,
+ 	   (xmlChar *) ns_uri) != 0)
+ 	ereport(ERROR,		/* is this an internal error??? */
+ 			(errmsg(could not register XML namespace with name \%s\ and URI \%s\,
+ 	ns_name, ns_uri)));
+ 			}
+ 		}
+ 
+ 		xpathcomp = 

Re: [HACKERS] Adding xpath_exists function

2010-05-31 Thread Bruce Momjian
Mike Fowler wrote:
 Robert Haas wrote:
  Please email your patch to the list (replying to this email is fine)
  and add it here:
  https://commitfest.postgresql.org/action/commitfest_view/open

 Here's my patch, developed against HEAD, that adds the function 
 'xpath_exists'. The function is a lot simpler than originally thought, 
 so none of the string manipulation previously discussed was required. 
 I've also included some regression tests that test the function with and 
 without xml namespaces. I should  note that before I added my tests all 
 existing tests passed.
 
 One observation that can be made is that I've largely copied the 
 existing xpath function and altered it to use a different method from 
 the libxml API. I've done it to save me redoing all the namespace 
 handling, however it's apparent to me that if we wanted to expose more 
 of the libxml api we will quickly start having a lot of duplicate code. 
 I notice that refactoring existing code whilst adding new code is 
 generally frowned upon, so once this patch is accepted I will look to 
 refactor the xpath and xpath_exists function. I could even add an 
 xpath_count method at the same time ;) .
 
 Thanks in advance for any and all feedback,

I have added this to the next commit-fest:

https://commitfest.postgresql.org/action/commitfest_view?id=6

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding xpath_exists function

2010-05-11 Thread Mike Fowler

Robert Haas wrote:

Please email your patch to the list (replying to this email is fine)
and add it here:
https://commitfest.postgresql.org/action/commitfest_view/open
  
Here's my patch, developed against HEAD, that adds the function 
'xpath_exists'. The function is a lot simpler than originally thought, 
so none of the string manipulation previously discussed was required. 
I've also included some regression tests that test the function with and 
without xml namespaces. I should  note that before I added my tests all 
existing tests passed.


One observation that can be made is that I've largely copied the 
existing xpath function and altered it to use a different method from 
the libxml API. I've done it to save me redoing all the namespace 
handling, however it's apparent to me that if we wanted to expose more 
of the libxml api we will quickly start having a lot of duplicate code. 
I notice that refactoring existing code whilst adding new code is 
generally frowned upon, so once this patch is accepted I will look to 
refactor the xpath and xpath_exists function. I could even add an 
xpath_count method at the same time ;) .


Thanks in advance for any and all feedback,

--
Mike Fowler
Registered Linux user: 379787

I could be a genius if I just put my mind to it, and I,
I could do anything, if only I could get 'round to it
-PULP 'Glory Days'

Index: src/backend/utils/adt/xml.c
===
RCS file: /home/mfowler/cvsrepo/pgrepo/pgsql/src/backend/utils/adt/xml.c,v
retrieving revision 1.97
diff -c -r1.97 xml.c
*** src/backend/utils/adt/xml.c	3 Mar 2010 17:29:45 -	1.97
--- src/backend/utils/adt/xml.c	11 May 2010 07:54:53 -
***
*** 3495,3497 
--- 3495,3670 
  	return 0;
  #endif
  }
+ 
+ /*
+  * Determines if the node specified by the supplied XPath exists
+  * in a given XML document, returning a boolean.
+  *
+  * It is up to the user to ensure that the XML passed is in fact
+  * an XML document - XPath doesn't work easily on fragments without
+  * a context node being known.
+  */
+ Datum
+ xpath_exists(PG_FUNCTION_ARGS)
+ {
+ #ifdef USE_LIBXML
+ 	text	   *xpath_expr_text = PG_GETARG_TEXT_P(0);
+ 	xmltype*data = PG_GETARG_XML_P(1);
+ 	ArrayType  *namespaces = PG_GETARG_ARRAYTYPE_P(2);
+ 	ArrayBuildState *astate = NULL;
+ 	xmlParserCtxtPtr ctxt = NULL;
+ 	xmlDocPtr	doc = NULL;
+ 	xmlXPathContextPtr xpathctx = NULL;
+ 	xmlXPathCompExprPtr xpathcomp = NULL;
+ 	char	   *datastr;
+ 	int32		len;
+ 	int32		xpath_len;
+ 	xmlChar*string;
+ 	xmlChar*xpath_expr;
+ 	int			i;
+ 	int			res_nitems;
+ 	int			ndim;
+ 	Datum	   *ns_names_uris;
+ 	bool	   *ns_names_uris_nulls;
+ 	int			ns_count;
+ 	int			result;
+ 
+ 	/*
+ 	 * Namespace mappings are passed as text[].  If an empty array is passed
+ 	 * (ndim = 0, 0-dimensional), then there are no namespace mappings.
+ 	 * Else, a 2-dimensional array with length of the second axis being equal
+ 	 * to 2 should be passed, i.e., every subarray contains 2 elements, the
+ 	 * first element defining the name, the second one the URI.  Example:
+ 	 * ARRAY[ARRAY['myns', 'http://example.com'], ARRAY['myns2',
+ 	 * 'http://example2.com']].
+ 	 */
+ 	ndim = ARR_NDIM(namespaces);
+ 	if (ndim != 0)
+ 	{
+ 		int		   *dims;
+ 
+ 		dims = ARR_DIMS(namespaces);
+ 
+ 		if (ndim != 2 || dims[1] != 2)
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_DATA_EXCEPTION),
+ 	 errmsg(invalid array for XML namespace mapping),
+ 	 errdetail(The array must be two-dimensional with length of the second axis equal to 2.)));
+ 
+ 		Assert(ARR_ELEMTYPE(namespaces) == TEXTOID);
+ 
+ 		deconstruct_array(namespaces, TEXTOID, -1, false, 'i',
+ 		  ns_names_uris, ns_names_uris_nulls,
+ 		  ns_count);
+ 
+ 		Assert((ns_count % 2) == 0);	/* checked above */
+ 		ns_count /= 2;			/* count pairs only */
+ 	}
+ 	else
+ 	{
+ 		ns_names_uris = NULL;
+ 		ns_names_uris_nulls = NULL;
+ 		ns_count = 0;
+ 	}
+ 
+ 	datastr = VARDATA(data);
+ 	len = VARSIZE(data) - VARHDRSZ;
+ 	xpath_len = VARSIZE(xpath_expr_text) - VARHDRSZ;
+ 	if (xpath_len == 0)
+ 		ereport(ERROR,
+ (errcode(ERRCODE_DATA_EXCEPTION),
+  errmsg(empty XPath expression)));
+ 
+ 	string = (xmlChar *) palloc((len + 1) * sizeof(xmlChar));
+ 	memcpy(string, datastr, len);
+ 	string[len] = '\0';
+ 
+ 	xpath_expr = (xmlChar *) palloc((xpath_len + 1) * sizeof(xmlChar));
+ 	memcpy(xpath_expr, VARDATA(xpath_expr_text), xpath_len);
+ 	xpath_expr[xpath_len] = '\0';
+ 
+ 	pg_xml_init();
+ 	xmlInitParser();
+ 
+ 	PG_TRY();
+ 	{
+ 		/*
+ 		 * redundant XML parsing (two parsings for the same value during one
+ 		 * command execution are possible)
+ 		 */
+ 		ctxt = xmlNewParserCtxt();
+ 		if (ctxt == NULL)
+ 			xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY,
+ 		could not allocate parser context);
+ 		doc = xmlCtxtReadMemory(ctxt, (char *) string, len, NULL, NULL, 0);
+ 		if (doc == NULL)
+ 			xml_ereport(ERROR, ERRCODE_INVALID_XML_DOCUMENT,
+ 		could not 

Re: [HACKERS] Adding xpath_exists function

2010-05-07 Thread Mike Fowler

Robert Haas wrote:

Oh, I see.  Well, that might be reasonable syntactic sugar, although I
think you should make it wrap the path in exists() unconditionally,
rather than testing for an existing wrap.
  


I'll leave it out for now, it saves me some effort after all.


Please email your patch to the list (replying to this email is fine)
and add it here:
https://commitfest.postgresql.org/action/commitfest_view/open
  


Will do once I've finished. Thanks for your feedback.

Regards,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding xpath_exists function

2010-05-06 Thread Robert Haas
On Thu, May 6, 2010 at 10:10 AM, Mike Fowler m...@mlfowler.com wrote:
 Hi hackers,

 Although this is a very small change I figured I'd practice the policy of
 outlining your change before you write the code and attempt a patch
 submission. Essentially I see the function as a convenience function that
 exposes the results of the xpath built in exists() function as a boolean for
 easier SQL. The syntax will match the xpath function already present:

 xpath_exists(xpath, xml[, nsarray])

 The implementation will check that the xpath value contains 'exists(.)'
 and add it if not present for added usability. I can't blindly wrap the
 value with exists(...) as exists(exists(.)) will always return true in
 xpath. I've not allocated the oid yet, but will try to earn my bonus points
 for getting it close to xpath()'s oid. :) Appropriate regression tests will
 be added after I've ensured that 'make check' is happy I've not regressed
 anything.

I'm not sure I understand how this more convenient than just using
xpath() with exists()?

 Can I confirm that contrib/xml2 is deprecated and I should be carrying out
 my work in backend/utils/adt/xml.c?

Yes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding xpath_exists function

2010-05-06 Thread mike

Quoting Robert Haas robertmh...@gmail.com:



I'm not sure I understand how this more convenient than just using
xpath() with exists()?



It will save a lot of complexity in WHERE clauses. For example using  
exists() in xpath() you might construct something like:


WHERE array_dims(xpath('exists(/foo/bar)','barfoo//bar'::xml) IS  
NOT NULL ...


Whereas a dedicated xpath_exists() would look like:

WHERE xpath_exists('/foo/bar','barfoo//bar'::xml) 

I accept this example is quite basic, but I hope it illustrates the  
added usability. I think xml in sql is complex enough, especially when  
you start considering namespaces, that anything we can do to simplify  
common use cases can only help improve the uptake of postgres xml.


Thanks,

--
Mike Fowler


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding xpath_exists function

2010-05-06 Thread Robert Haas
On Thu, May 6, 2010 at 5:53 PM,  m...@mlfowler.com wrote:
 Quoting Robert Haas robertmh...@gmail.com:


 I'm not sure I understand how this more convenient than just using
 xpath() with exists()?


 It will save a lot of complexity in WHERE clauses. For example using
 exists() in xpath() you might construct something like:

 WHERE array_dims(xpath('exists(/foo/bar)','barfoo//bar'::xml) IS NOT
 NULL ...

 Whereas a dedicated xpath_exists() would look like:

 WHERE xpath_exists('/foo/bar','barfoo//bar'::xml) 

 I accept this example is quite basic, but I hope it illustrates the added
 usability. I think xml in sql is complex enough, especially when you start
 considering namespaces, that anything we can do to simplify common use cases
 can only help improve the uptake of postgres xml.

Oh, I see.  Well, that might be reasonable syntactic sugar, although I
think you should make it wrap the path in exists() unconditionally,
rather than testing for an existing wrap.

Please email your patch to the list (replying to this email is fine)
and add it here:
https://commitfest.postgresql.org/action/commitfest_view/open

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers