Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2015-01-18 Thread Ali Akbar
2015-01-18 10:44 GMT+07:00 Peter Eisentraut pete...@gmx.net:

 On 1/7/15 8:51 PM, Ali Akbar wrote:
  So now +1 for back-patching this.

 Done.  Was a bit of work to get it into all the old versions.


Wow. Thanks.

Btw, for bug-fix patches like this, should the patch creator (me) also
create patches for back branches?

Regards,
-- 
Ali Akbar


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2015-01-18 Thread David Fetter
On Sun, Jan 18, 2015 at 06:05:05PM +0700, Ali Akbar wrote:
 2015-01-18 10:44 GMT+07:00 Peter Eisentraut pete...@gmx.net:
 
  On 1/7/15 8:51 PM, Ali Akbar wrote:
   So now +1 for back-patching this.
 
  Done.  Was a bit of work to get it into all the old versions.
 
 
 Wow. Thanks.
 
 Btw, for bug-fix patches like this, should the patch creator (me) also
 create patches for back branches?

As I understand it, back-patches are the committer's responsibility.

The submitter might make suggestions as to how this might be
approached if it doesn't appear trivial.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2015-01-18 Thread Michael Paquier
On Mon, Jan 19, 2015 at 2:38 AM, David Fetter da...@fetter.org wrote:
 On Sun, Jan 18, 2015 at 06:05:05PM +0700, Ali Akbar wrote:
 2015-01-18 10:44 GMT+07:00 Peter Eisentraut pete...@gmx.net:
 Btw, for bug-fix patches like this, should the patch creator (me) also
 create patches for back branches?

 As I understand it, back-patches are the committer's responsibility.
 The submitter might make suggestions as to how this might be
 approached if it doesn't appear trivial.
TBH, I would imagine that patches that can be applied to back-branches
are a better start point than plain scratch particularly if there are
diffs in stable branches compared to HEAD. Everybody's time is
important.
--
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2015-01-18 Thread Tom Lane
Michael Paquier michael.paqu...@gmail.com writes:
 On Mon, Jan 19, 2015 at 2:38 AM, David Fetter da...@fetter.org wrote:
 On Sun, Jan 18, 2015 at 06:05:05PM +0700, Ali Akbar wrote:
 2015-01-18 10:44 GMT+07:00 Peter Eisentraut pete...@gmx.net:
 Btw, for bug-fix patches like this, should the patch creator (me) also
 create patches for back branches?

 As I understand it, back-patches are the committer's responsibility.
 The submitter might make suggestions as to how this might be
 approached if it doesn't appear trivial.

 TBH, I would imagine that patches that can be applied to back-branches
 are a better start point than plain scratch particularly if there are
 diffs in stable branches compared to HEAD. Everybody's time is
 important.

Yeah --- and I'd argue that it's largely a waste of time to work on
creating back-branch patches until the HEAD patch is in final form.
Since we've generally reserved the right for the committer to whack
patches around before committing, I think this means the committer
also has to do the work of back-patch adjustment.

Now a committer who doesn't feel like doing that could certainly say
here's the version of the HEAD patch that I'm willing to commit, but
it doesn't apply cleanly in back branches; could you work up back-branch
equivalents?.  But that hasn't been the usual approach.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2015-01-17 Thread Peter Eisentraut
On 1/7/15 8:51 PM, Ali Akbar wrote:
 So now +1 for back-patching this.

Done.  Was a bit of work to get it into all the old versions.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2015-01-06 Thread Peter Eisentraut
committed version 7


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2015-01-06 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 committed version 7

Isn't that a back-patchable bug fix?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-12-14 Thread Ali Akbar

 I ran a test using postgres-US.fo built in the PostgreSQL source tree,
 which is 38 MB, and ran

 select unnest(xpath('//fo:bookmark-title', b, array[array['fo',
 'http://www.w3.org/1999/XSL/Format']])) from data;

 (Table contains one row only.)

 The timings were basically indistinguishable between the three code
 versions.

 I'll try to reproduce your test.  How big is your file?  Do you have a
 link to the actual file?  Could you share your load script?


I use this xml sample:
http://w3.energistics.org/schema/prodml_v1.2.0_data/XML_Examples/productVolume_no_xsl.xml

Basically i loaded the xml to table u 100 times. Load script attached.

Regards,
-- 
Ali Akbar


load_test.sql
Description: application/sql

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-12-14 Thread Ali Akbar
2014-12-14 22:18 GMT+07:00 Ali Akbar the.ap...@gmail.com:


 I ran a test using postgres-US.fo built in the PostgreSQL source tree,
 which is 38 MB, and ran

 select unnest(xpath('//fo:bookmark-title', b, array[array['fo',
 'http://www.w3.org/1999/XSL/Format']])) from data;

 (Table contains one row only.)

 The timings were basically indistinguishable between the three code
 versions.

 I'll try to reproduce your test.  How big is your file?  Do you have a
 link to the actual file?  Could you share your load script?


 I use this xml sample:
 http://w3.energistics.org/schema/prodml_v1.2.0_data/XML_Examples/productVolume_no_xsl.xml

 Basically i loaded the xml to table u 100 times. Load script attached.


Peter, while reviewing the better performing patch myself, now i think the
patch needs more work to be committed. The structuring of the method will
be confusing in the long term. I think i'll restructure the patch in the
next commitfest.

So i propose to break the patch:
1. We apply the current patch which uses xmlNodeCopy, so that the
long-standing bug will be fixed in postgres.
2. I'll work with the performance enhancement in the next commitfest.

Maybe for (2), the current better-performing patch can be viewed as PoC of
the expected performance.

Regards,
-- 
Ali Akbar


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-12-14 Thread Michael Paquier
On Mon, Dec 15, 2014 at 9:05 AM, Ali Akbar the.ap...@gmail.com wrote:
 Peter, while reviewing the better performing patch myself, now i think the
 patch needs more work to be committed. The structuring of the method will be
 confusing in the long term. I think I'll restructure the patch in the next
 commitfest.
 So i propose to break the patch:
 1. We apply the current patch which uses xmlNodeCopy, so that the
 long-standing bug will be fixed in postgres.
 2. I'll work with the performance enhancement in the next commitfest.

 Maybe for (2), the current better-performing patch can be viewed as PoC of
 the expected performance.

Ali, are you currently working on that? Would you mind re-creating new
entries in the commit fest app for the new set of patches that you are
planning to do?
For now I am switching this patch as returned with feedback.
Thanks,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-12-14 Thread Ali Akbar
2014-12-15 10:19 GMT+07:00 Michael Paquier michael.paqu...@gmail.com:

 On Mon, Dec 15, 2014 at 9:05 AM, Ali Akbar the.ap...@gmail.com wrote:
  Peter, while reviewing the better performing patch myself, now i think
 the
  patch needs more work to be committed. The structuring of the method
 will be
  confusing in the long term. I think I'll restructure the patch in the
 next
  commitfest.
  So i propose to break the patch:
  1. We apply the current patch which uses xmlNodeCopy, so that the
  long-standing bug will be fixed in postgres.
  2. I'll work with the performance enhancement in the next commitfest.
 
  Maybe for (2), the current better-performing patch can be viewed as PoC
 of
  the expected performance.

 Ali, are you currently working on that? Would you mind re-creating new
 entries in the commit fest app for the new set of patches that you are
 planning to do?
 For now I am switching this patch as returned with feedback.
 Thanks,


What i mean, the last patch (v7 patch) as it is is enough to fix the bug
(nested xpath namespace problem). I think the performance regression is
still acceptable (in my case it's ~20%), because the bug is severe.
Currently, xpath can return invalid xml because the namespace isn't
included in the output!

What i'll be working is the v4 patch, because it turns out the v4 patch has
better performance (~10%, but Peter's test shows it isn't the case). But,
the problem is the v4 patch is organized wrongly, and hacks around the
libxml's xml node structure (duplicating the namespace on the existing
structure). I'll work on that, but it will affects the performance benefit.

So what i propose is, we close the longstanding bug in this comitfest with
the v7 patch. I'll work on improving the performance, without compromising
good code structure. If the result is good, i'll submit the patch.

Thanks

Regards,
-- 
Ali Akbar


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-12-14 Thread Ali Akbar
2014-12-15 11:02 GMT+07:00 Ali Akbar the.ap...@gmail.com:

 2014-12-15 10:19 GMT+07:00 Michael Paquier michael.paqu...@gmail.com:

 On Mon, Dec 15, 2014 at 9:05 AM, Ali Akbar the.ap...@gmail.com wrote:
  Peter, while reviewing the better performing patch myself, now i think
 the
  patch needs more work to be committed. The structuring of the method
 will be
  confusing in the long term. I think I'll restructure the patch in the
 next
  commitfest.
  So i propose to break the patch:
  1. We apply the current patch which uses xmlNodeCopy, so that the
  long-standing bug will be fixed in postgres.
  2. I'll work with the performance enhancement in the next commitfest.
 
  Maybe for (2), the current better-performing patch can be viewed as PoC
 of
  the expected performance.

 Ali, are you currently working on that? Would you mind re-creating new
 entries in the commit fest app for the new set of patches that you are
 planning to do?
 For now I am switching this patch as returned with feedback.
 Thanks,


 What i mean, the last patch (v7 patch) as it is is enough to fix the bug
 (nested xpath namespace problem). I think the performance regression is
 still acceptable (in my case it's ~20%), because the bug is severe.
 Currently, xpath can return invalid xml because the namespace isn't
 included in the output!


Sorry, typo here. What i mean isn't invalid xml, but incomplete xml.
Hence the next call to xpath with the previous result as its input will
become a problem because the namespace will not match.



 What i'll be working is the v4 patch, because it turns out the v4 patch
 has better performance (~10%, but Peter's test shows it isn't the case).
 But, the problem is the v4 patch is organized wrongly, and hacks around the
 libxml's xml node structure (duplicating the namespace on the existing
 structure). I'll work on that, but it will affects the performance benefit.

 So what i propose is, we close the longstanding bug in this comitfest with
 the v7 patch. I'll work on improving the performance, without compromising
 good code structure. If the result is good, i'll submit the patch.

 Thanks


Regards,
-- 
Ali Akbar


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-12-14 Thread Michael Paquier
On Mon, Dec 15, 2014 at 1:02 PM, Ali Akbar the.ap...@gmail.com wrote:
 2014-12-15 10:19 GMT+07:00 Michael Paquier michael.paqu...@gmail.com:

 On Mon, Dec 15, 2014 at 9:05 AM, Ali Akbar the.ap...@gmail.com wrote:
  Peter, while reviewing the better performing patch myself, now i think
  the
  patch needs more work to be committed. The structuring of the method
  will be
  confusing in the long term. I think I'll restructure the patch in the
  next
  commitfest.
  So i propose to break the patch:
  1. We apply the current patch which uses xmlNodeCopy, so that the
  long-standing bug will be fixed in postgres.
  2. I'll work with the performance enhancement in the next commitfest.
 
  Maybe for (2), the current better-performing patch can be viewed as PoC
  of
  the expected performance.

 Ali, are you currently working on that? Would you mind re-creating new
 entries in the commit fest app for the new set of patches that you are
 planning to do?
 For now I am switching this patch as returned with feedback.
 Thanks,


 What i mean, the last patch (v7 patch) as it is is enough to fix the bug
 (nested xpath namespace problem). I think the performance regression is
 still acceptable (in my case it's ~20%), because the bug is severe.
 Currently, xpath can return invalid xml because the namespace isn't included
 in the output!

 What i'll be working is the v4 patch, because it turns out the v4 patch has
 better performance (~10%, but Peter's test shows it isn't the case). But,
 the problem is the v4 patch is organized wrongly, and hacks around the
 libxml's xml node structure (duplicating the namespace on the existing
 structure). I'll work on that, but it will affects the performance benefit.

 So what i propose is, we close the longstanding bug in this comitfest with
 the v7 patch. I'll work on improving the performance, without compromising
 good code structure. If the result is good, i'll submit the patch.
OK. Could you then move this patch to the new CF with Needs Review
with Peter as reviewer then? He seems to be looking at it anyway
seeing the update from 12/11.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-12-14 Thread Ali Akbar
2014-12-15 11:06 GMT+07:00 Michael Paquier michael.paqu...@gmail.com:

 On Mon, Dec 15, 2014 at 1:02 PM, Ali Akbar the.ap...@gmail.com wrote:
  2014-12-15 10:19 GMT+07:00 Michael Paquier michael.paqu...@gmail.com:
 
  On Mon, Dec 15, 2014 at 9:05 AM, Ali Akbar the.ap...@gmail.com wrote:
   Peter, while reviewing the better performing patch myself, now i think
   the
   patch needs more work to be committed. The structuring of the method
   will be
   confusing in the long term. I think I'll restructure the patch in the
   next
   commitfest.
   So i propose to break the patch:
   1. We apply the current patch which uses xmlNodeCopy, so that the
   long-standing bug will be fixed in postgres.
   2. I'll work with the performance enhancement in the next commitfest.
  
   Maybe for (2), the current better-performing patch can be viewed as
 PoC
   of
   the expected performance.
 
  Ali, are you currently working on that? Would you mind re-creating new
  entries in the commit fest app for the new set of patches that you are
  planning to do?
  For now I am switching this patch as returned with feedback.
  Thanks,
 
 
  What i mean, the last patch (v7 patch) as it is is enough to fix the bug
  (nested xpath namespace problem). I think the performance regression is
  still acceptable (in my case it's ~20%), because the bug is severe.
  Currently, xpath can return invalid xml because the namespace isn't
 included
  in the output!
 
  What i'll be working is the v4 patch, because it turns out the v4 patch
 has
  better performance (~10%, but Peter's test shows it isn't the case). But,
  the problem is the v4 patch is organized wrongly, and hacks around the
  libxml's xml node structure (duplicating the namespace on the existing
  structure). I'll work on that, but it will affects the performance
 benefit.
 
  So what i propose is, we close the longstanding bug in this comitfest
 with
  the v7 patch. I'll work on improving the performance, without
 compromising
  good code structure. If the result is good, i'll submit the patch.
 OK. Could you then move this patch to the new CF with Needs Review

with Peter as reviewer then? He seems to be looking at it anyway
 seeing the update from 12/11.


OK. Moved to the new CF.

Regards,
-- 
Ali Akbar


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-12-11 Thread Peter Eisentraut
On 11/5/14 9:50 AM, Ali Akbar wrote:
 I noticed somewhat big performance regression if the xml is large (i use
 PRODML Product Volume sample from energistics.org http://energistics.org):
 * Without patch (tested 3 times):
 select unnest(xpath('//a:flow', x,
 ARRAY[['a','http://www.prodml.org/schemas/1series']])) from u;

 Time: 84,012 ms
 Time: 85,683 ms
 Time: 88,766 ms

 * With latest v6 patch (notice the correct result with namespace
 definition):
 
 select unnest(xpath('//a:flow', x,
 ARRAY[['a','http://www.prodml.org/schemas/1series']])) from u;

 Time: 108,912 ms
 Time: 108,267 ms
 Time: 114,848 ms
 
 
 It's 23% performance regression.
 
 * Just curious, i'm also testing v5 patch performance (notice the
 namespace in the result):
 select unnest(xpath('//a:flow', x,
 ARRAY[['a','http://www.prodml.org/schemas/1series']])) from u;

 Time: 92,552 ms
 Time: 97,440 ms
 Time: 99,309 ms
 
 The regression is only 13%. I know the xmlCopyNode() version (v6 patch)
 is much more cleaner than v5patch, should we consider the performance
 benefit?

I ran a test using postgres-US.fo built in the PostgreSQL source tree,
which is 38 MB, and ran

select unnest(xpath('//fo:bookmark-title', b, array[array['fo',
'http://www.w3.org/1999/XSL/Format']])) from data;

(Table contains one row only.)

The timings were basically indistinguishable between the three code
versions.

I'll try to reproduce your test.  How big is your file?  Do you have a
link to the actual file?  Could you share your load script?



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-11-05 Thread Ali Akbar
2014-11-04 22:16 GMT+07:00 Peter Eisentraut pete...@gmx.net:


 On 10/6/14 10:24 PM, Ali Akbar wrote:
  While reviewing the patch myself, i spotted some formatting problems in
  the code. Fixed in this v5 patch.
 
  Also, this patch uses context patch format (in first versions, because
  of the small modification, context patch format obfucates the changes.
  After reimplementation this isn't the case anymore)

 I think the problem this patch is addressing is real, and your approach
 is sound, but I'd ask you to go back to the xmlCopyNode() version, and
 try to add a test case for why the second argument = 1 is necessary.  I
 don't see any other problems.


 OK. Because upstream code is fixed in current version, i'll revert to the
 previous version. Test case added to regression test. With =1 argument, the
 result is correct:
 local:piece xmlns:local=\http://127.0.0.1\; xmlns=\http://127.0.0.2\;
 id=\1\
internalnumber one/internal
internal2/
  /local:piece

 without the argument, the result is not correct, all children will be
 lost. Because of that, the other regression test will fail too because the
 children is not copied:
 *** 584,593 

   -- Text XPath expressions evaluation
   SELECT xpath('/value', data) FROM xmltest;
 ! xpath
 ! --
 !  {valueone/value}
 !  {valuetwo/value}
   (2 rows)

   SELECT xpath(NULL, NULL) IS NULL FROM xmltest;
 --- 584,593 

   -- Text XPath expressions evaluation
   SELECT xpath('/value', data) FROM xmltest;
 !xpath
 ! 
 !  {value/}
 !  {value/}
   (2 rows)

   SELECT xpath(NULL, NULL) IS NULL FROM xmltest;
 ***
 ... cut

 updated patch attached.


I noticed somewhat big performance regression if the xml is large (i use
PRODML Product Volume sample from energistics.org):
* Without patch (tested 3 times):
select unnest(xpath('//a:flow', x, ARRAY[['a','
http://www.prodml.org/schemas/1series']])) from u;

unnest
---
 flow
+
 kindgas
lift/kind+
...
Time: 84,012 ms
Time: 85,683 ms
Time: 88,766 ms


* With latest v6 patch (notice the correct result with namespace
definition):

select unnest(xpath('//a:flow', x, ARRAY[['a','
http://www.prodml.org/schemas/1series']])) from u;

unnest
---
 flow xmlns=http://www.prodml.org/schemas/1series;
+
...
Time: 108,912 ms
Time: 108,267 ms
Time: 114,848 ms


It's 23% performance regression.

* Just curious, i'm also testing v5 patch performance (notice the namespace
in the result):
select unnest(xpath('//a:flow', x, ARRAY[['a','
http://www.prodml.org/schemas/1series']])) from u;

unnest
---
 flow xmlns=http://www.prodml.org/schemas/1series;
+
 kindgas
lift/kind+
Time: 92,552 ms
Time: 97,440 ms
Time: 99,309 ms

The regression is only 13%. I know the xmlCopyNode() version (v6 patch) is
much more cleaner than v5patch, should we consider the performance benefit?

Anyway, thanks for the review! :)

Regards,
--
Ali Akbar


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-11-04 Thread Peter Eisentraut
On 7/11/14 4:36 AM, Ali Akbar wrote:
 But i found some bug in libxml2's code, because we call xmlCopyNode with
 1 as the second parameter, internally xmlCopyNode will call
 xmlStaticCopyNode recursively via xmlStaticCopyNodeList. And
 xmlStaticCopyNodeList doesn't check the return of xmlStaticCopyNode
 whether it's NULL. So if someday the recursion returns NULL (maybe
 because of memory exhaustion), it will SEGFAULT.
 
 Knowing this but in libxml2 code, is this patch is still acceptable?

This problem was fixed in libxml2 2.9.2 (see
https://bugzilla.gnome.org/show_bug.cgi?id=708681).


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-11-04 Thread Peter Eisentraut
On 7/8/14 6:03 AM, Ali Akbar wrote:
 If we put 1 as the parameter, the resulting Node will link to the
 existing children. In this case, the namespace problem for the children
 might be still exists. If any of the children uses different
 namespace(s) than the parent, the namespace definition will not be
 copied correctly.

This analysis might very well be right, but I think we should prove it
with a test case.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-11-04 Thread Peter Eisentraut
On 10/6/14 10:24 PM, Ali Akbar wrote:
 While reviewing the patch myself, i spotted some formatting problems in
 the code. Fixed in this v5 patch.
 
 Also, this patch uses context patch format (in first versions, because
 of the small modification, context patch format obfucates the changes.
 After reimplementation this isn't the case anymore)

I think the problem this patch is addressing is real, and your approach
is sound, but I'd ask you to go back to the xmlCopyNode() version, and
try to add a test case for why the second argument = 1 is necessary.  I
don't see any other problems.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-10-06 Thread Ali Akbar
While reviewing the patch myself, i spotted some formatting problems in the
code. Fixed in this v5 patch.

Also, this patch uses context patch format (in first versions, because of
the small modification, context patch format obfucates the changes. After
reimplementation this isn't the case anymore)

Thanks,
Ali Akbar
*** a/src/backend/utils/adt/xml.c
--- b/src/backend/utils/adt/xml.c
***
*** 141,149  static bool print_xml_decl(StringInfo buf, const xmlChar *version,
  			   pg_enc encoding, int standalone);
  static xmlDocPtr xml_parse(text *data, XmlOptionType xmloption_arg,
  		  bool preserve_whitespace, int encoding);
! static text *xml_xmlnodetoxmltype(xmlNodePtr cur);
  static int xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj,
! 	   ArrayBuildState **astate);
  #endif   /* USE_LIBXML */
  
  static StringInfo query_to_xml_internal(const char *query, char *tablename,
--- 141,151 
  			   pg_enc encoding, int standalone);
  static xmlDocPtr xml_parse(text *data, XmlOptionType xmloption_arg,
  		  bool preserve_whitespace, int encoding);
! static text *xml_xmlnodetoxmltype(xmlNodePtr cur,
! 	   PgXmlErrorContext *xmlerrcxt);
  static int xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj,
! 	   ArrayBuildState **astate,
! 	   PgXmlErrorContext *xmlerrcxt);
  #endif   /* USE_LIBXML */
  
  static StringInfo query_to_xml_internal(const char *query, char *tablename,
***
*** 3594,3620  SPI_sql_row_to_xmlelement(int rownum, StringInfo result, char *tablename,
  
  #ifdef USE_LIBXML
  
  /*
   * Convert XML node to text (dump subtree in case of element,
   * return value otherwise)
   */
  static text *
! xml_xmlnodetoxmltype(xmlNodePtr cur)
  {
  	xmltype*result;
  
  	if (cur-type == XML_ELEMENT_NODE)
  	{
  		xmlBufferPtr buf;
  
  		buf = xmlBufferCreate();
  		PG_TRY();
  		{
  			xmlNodeDump(buf, NULL, cur, 0, 1);
  			result = xmlBuffer_to_xmltype(buf);
  		}
  		PG_CATCH();
  		{
  			xmlBufferFree(buf);
  			PG_RE_THROW();
  		}
--- 3596,3706 
  
  #ifdef USE_LIBXML
  
+ /* check ns definition of node and its childrens. If any one of ns is
+  * not defined in node and it's children, but in the node's parent,
+  * copy the definition to node.
+  */
+ static void
+ xml_checkandcopyns(xmlNodePtr root,
+    PgXmlErrorContext *xmlerrcxt,
+    xmlNodePtr node,
+    xmlNsPtr lastns_before)
+ {
+ 	xmlNsPtr ns = NULL;
+ 	xmlNsPtr cur_ns;
+ 	xmlNodePtr cur;
+ 
+ 	if (node-ns != NULL)
+ 	{
+ 		/* check in new nses */
+ 		cur_ns = lastns_before == NULL ? node-nsDef : lastns_before-next;
+ 		while (cur_ns != NULL)
+ 		{
+ 			if (cur_ns-href != NULL)
+ 			{
+ if (((cur_ns-prefix == NULL)  (node-ns-prefix == NULL)) ||
+ 	((cur_ns-prefix != NULL)  (node-ns-prefix != NULL) 
+ 	 xmlStrEqual(cur_ns-prefix, node-ns-prefix)))
+ {
+ 	ns = cur_ns;
+ 	break;
+ }
+ 			}
+ 			cur_ns = cur_ns-next;
+ 		}
+ 		if (ns == NULL) /* not in new nses */
+ 		{
+ 			ns = xmlSearchNs(NULL, node-parent, node-ns-prefix);
+ 
+ 			if (ns != NULL)
+ 			{
+ ns = xmlNewNs(root, ns-href, ns-prefix);
+ 
+ if (ns == NULL  xmlerrcxt-err_occurred)
+ 	xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
+ 		could not allocate xmlNs);
+ 			}
+ 		}
+ 	}
+ 	/* check and copy ns for children recursively */
+ 	cur = node-children;
+ 	while (cur != NULL)
+ 	{
+ 		xml_checkandcopyns(root, xmlerrcxt, cur, lastns_before);
+ 		cur = cur-next;
+ 	}
+ }
+ 
  /*
   * Convert XML node to text (dump subtree in case of element,
   * return value otherwise)
   */
  static text *
! xml_xmlnodetoxmltype(xmlNodePtr cur, PgXmlErrorContext *xmlerrcxt)
  {
  	xmltype*result;
  
  	if (cur-type == XML_ELEMENT_NODE)
  	{
  		xmlBufferPtr buf;
+ 		xmlNsPtr lastns_before;
+ 		xmlNsPtr ns;
+ 		xmlNsPtr next;
  
  		buf = xmlBufferCreate();
+ 
  		PG_TRY();
  		{
+ 			lastns_before = cur-nsDef;
+ 			if (lastns_before != NULL)
+ 			{
+ while (lastns_before-next != NULL)
+ 	lastns_before = lastns_before-next;
+ 			}
+ 			xml_checkandcopyns(cur, xmlerrcxt, cur, lastns_before);
+ 
  			xmlNodeDump(buf, NULL, cur, 0, 1);
  			result = xmlBuffer_to_xmltype(buf);
+ 
+ 			/* delete and free new nses */
+ 			ns = lastns_before == NULL ? cur-nsDef : lastns_before-next;
+ 			while (ns != NULL)
+ 			{
+ next = ns-next;
+ xmlFree(ns);
+ ns = next;
+ 			}
+ 
+ 			if (lastns_before == NULL)
+ cur-nsDef = NULL;
+ 			else
+ lastns_before-next = NULL;
  		}
  		PG_CATCH();
  		{
+ 			/* new namespaces will be freed while free-ing the node, so we
+ 			 * won't free it here
+ 			 */
  			xmlBufferFree(buf);
  			PG_RE_THROW();
  		}
***
*** 3660,3666  xml_xmlnodetoxmltype(xmlNodePtr cur)
   */
  static int
  xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj,
! 	   ArrayBuildState **astate)
  {
  	int			result = 0;
  	Datum		datum;
--- 3746,3753 
   */
  static int
  xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj,
! 	  

Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-08-29 Thread Ali Akbar
Greetings,

Because of the memory bug in xmlCopyNode, this is a new patch with
different method. In this patch, instead of using xmlCopyNode to bring the
namespace back, we added the required namespaces to the  node before
dumping the node to string, and cleaning it up afterwards (because the same
node could be dumped again in next xpath result).

Thanks,
Ali Akbar


2014-07-11 15:36 GMT+07:00 Ali Akbar the.ap...@gmail.com:

 Greetings,

 Attached modified patch to handle xmlCopyNode returning NULL. The patch is
 larger because xmlerrcxt must be passed to xml_xmlnodetoxmltype (xmlerrcxt
 is needed for calling xml_ereport).

 From libxml2 source, it turns out that the other cases that xmlCopyNode
 will return NULL will not happen. So in this patch i assume that the only
 case is memory exhaustion.

 But i found some bug in libxml2's code, because we call xmlCopyNode with 1
 as the second parameter, internally xmlCopyNode will call xmlStaticCopyNode
 recursively via xmlStaticCopyNodeList. And xmlStaticCopyNodeList doesn't
 check the return of xmlStaticCopyNode whether it's NULL. So if someday the
 recursion returns NULL (maybe because of memory exhaustion), it will
 SEGFAULT.

 Knowing this but in libxml2 code, is this patch is still acceptable?

 Thanks,
 Ali Akbar




-- 
Ali Akbar
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 422be69..f34c87e 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -141,9 +141,11 @@ static bool print_xml_decl(StringInfo buf, const xmlChar *version,
 			   pg_enc encoding, int standalone);
 static xmlDocPtr xml_parse(text *data, XmlOptionType xmloption_arg,
 		  bool preserve_whitespace, int encoding);
-static text *xml_xmlnodetoxmltype(xmlNodePtr cur);
+static text *xml_xmlnodetoxmltype(xmlNodePtr cur,
+	   PgXmlErrorContext *xmlerrcxt);
 static int xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj,
-	   ArrayBuildState **astate);
+	   ArrayBuildState **astate,
+	   PgXmlErrorContext *xmlerrcxt);
 #endif   /* USE_LIBXML */
 
 static StringInfo query_to_xml_internal(const char *query, char *tablename,
@@ -3590,27 +3592,109 @@ SPI_sql_row_to_xmlelement(int rownum, StringInfo result, char *tablename,
 
 #ifdef USE_LIBXML
 
+/* check ns definition of node and its childrens. If any one of ns is
+ * not defined in node and it's children, but in the node's parent,
+ * copy the definition to node.
+ */
+static void
+xml_checkandcopyns(xmlNodePtr root,
+   PgXmlErrorContext *xmlerrcxt,
+   xmlNodePtr node,
+   xmlNsPtr lastns_before)
+{
+	xmlNsPtr ns = NULL;
+	xmlNsPtr cur_ns;
+	xmlNodePtr cur;
+
+	if (node-ns != NULL)
+	{
+		/* check in new nses */
+		cur_ns = lastns_before == NULL ? node-nsDef : lastns_before-next;
+		while (cur_ns != NULL)
+		{
+			if (cur_ns-href != NULL)
+			{
+if (((cur_ns-prefix == NULL)  (node-ns-prefix == NULL)) ||
+	((cur_ns-prefix != NULL)  (node-ns-prefix != NULL) 
+	 xmlStrEqual(cur_ns-prefix, node-ns-prefix)))
+{
+	ns = cur_ns;
+	break;
+}
+			}
+			cur_ns = cur_ns-next;
+		}
+		if (ns == NULL) /* not in new nses */
+		{
+			ns = xmlSearchNs(NULL, node-parent, node-ns-prefix);
+
+			if (ns != NULL) {
+ns = xmlNewNs(root, ns-href, ns-prefix);
+
+if (ns == NULL  xmlerrcxt-err_occurred) {
+	xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
+		could not allocate xmlNs);
+}
+			}
+		}
+	}
+	/* check and copy ns for children recursively */
+	cur = node-children;
+	while (cur != NULL) {
+		xml_checkandcopyns(root, xmlerrcxt, cur, lastns_before);
+		cur = cur-next;
+	}
+}
+
 /*
  * Convert XML node to text (dump subtree in case of element,
  * return value otherwise)
  */
 static text *
-xml_xmlnodetoxmltype(xmlNodePtr cur)
+xml_xmlnodetoxmltype(xmlNodePtr cur, PgXmlErrorContext *xmlerrcxt)
 {
 	xmltype*result;
 
 	if (cur-type == XML_ELEMENT_NODE)
 	{
 		xmlBufferPtr buf;
+		xmlNsPtr lastns_before;
+		xmlNsPtr ns;
+		xmlNsPtr next;
 
 		buf = xmlBufferCreate();
+
 		PG_TRY();
 		{
+			lastns_before = cur-nsDef;
+			if (lastns_before != NULL) {
+while (lastns_before-next != NULL) {
+	lastns_before = lastns_before-next;
+}
+			}
+			xml_checkandcopyns(cur, xmlerrcxt, cur, lastns_before);
+
 			xmlNodeDump(buf, NULL, cur, 0, 1);
 			result = xmlBuffer_to_xmltype(buf);
+
+			/* delete and free new nses */
+			ns = lastns_before == NULL ? cur-nsDef : lastns_before-next;
+			while (ns != NULL) {
+next = ns-next;
+xmlFree(ns);
+ns = next;
+			}
+			if (lastns_before == NULL) {
+cur-nsDef = NULL;
+			} else {
+lastns_before-next = NULL;
+			}
 		}
 		PG_CATCH();
 		{
+			/* new namespaces will be freed while free-ing the node, so we
+			 * won't free it here
+			 */
 			xmlBufferFree(buf);
 			PG_RE_THROW();
 		}
@@ -3656,7 +3740,8 @@ xml_xmlnodetoxmltype(xmlNodePtr cur)
  */
 static int
 xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj,
-	   ArrayBuildState **astate)
+	   ArrayBuildState 

Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-07-11 Thread Ali Akbar
Greetings,

Attached modified patch to handle xmlCopyNode returning NULL. The patch is
larger because xmlerrcxt must be passed to xml_xmlnodetoxmltype (xmlerrcxt
is needed for calling xml_ereport).

From libxml2 source, it turns out that the other cases that xmlCopyNode
will return NULL will not happen. So in this patch i assume that the only
case is memory exhaustion.

But i found some bug in libxml2's code, because we call xmlCopyNode with 1
as the second parameter, internally xmlCopyNode will call xmlStaticCopyNode
recursively via xmlStaticCopyNodeList. And xmlStaticCopyNodeList doesn't
check the return of xmlStaticCopyNode whether it's NULL. So if someday the
recursion returns NULL (maybe because of memory exhaustion), it will
SEGFAULT.

Knowing this but in libxml2 code, is this patch is still acceptable?

Thanks,
Ali Akbar
*** a/src/backend/utils/adt/xml.c
--- b/src/backend/utils/adt/xml.c
***
*** 141,149  static bool print_xml_decl(StringInfo buf, const xmlChar *version,
  			   pg_enc encoding, int standalone);
  static xmlDocPtr xml_parse(text *data, XmlOptionType xmloption_arg,
  		  bool preserve_whitespace, int encoding);
! static text *xml_xmlnodetoxmltype(xmlNodePtr cur);
  static int xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj,
! 	   ArrayBuildState **astate);
  #endif   /* USE_LIBXML */
  
  static StringInfo query_to_xml_internal(const char *query, char *tablename,
--- 141,151 
  			   pg_enc encoding, int standalone);
  static xmlDocPtr xml_parse(text *data, XmlOptionType xmloption_arg,
  		  bool preserve_whitespace, int encoding);
! static text *xml_xmlnodetoxmltype(xmlNodePtr cur,
! 	   PgXmlErrorContext *xmlerrcxt);
  static int xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj,
! 	   ArrayBuildState **astate,
! 	   PgXmlErrorContext *xmlerrcxt);
  #endif   /* USE_LIBXML */
  
  static StringInfo query_to_xml_internal(const char *query, char *tablename,
***
*** 3595,3620  SPI_sql_row_to_xmlelement(int rownum, StringInfo result, char *tablename,
   * return value otherwise)
   */
  static text *
! xml_xmlnodetoxmltype(xmlNodePtr cur)
  {
  	xmltype*result;
  
  	if (cur-type == XML_ELEMENT_NODE)
  	{
  		xmlBufferPtr buf;
  
  		buf = xmlBufferCreate();
  		PG_TRY();
  		{
! 			xmlNodeDump(buf, NULL, cur, 0, 1);
  			result = xmlBuffer_to_xmltype(buf);
  		}
  		PG_CATCH();
  		{
  			xmlBufferFree(buf);
  			PG_RE_THROW();
  		}
  		PG_END_TRY();
  		xmlBufferFree(buf);
  	}
  	else
--- 3597,3636 
   * return value otherwise)
   */
  static text *
! xml_xmlnodetoxmltype(xmlNodePtr cur, PgXmlErrorContext *xmlerrcxt)
  {
  	xmltype*result;
  
  	if (cur-type == XML_ELEMENT_NODE)
  	{
  		xmlBufferPtr buf;
+ 		xmlNodePtr cur_copy;
  
  		buf = xmlBufferCreate();
+ 
+ 		/* the result of xmlNodeDump won't contain namespace definitions
+ 		 * from parent nodes, but xmlCopyNode duplicates a node along
+ 		 * with its required namespace definitions.
+ 		 */
+ 		cur_copy = xmlCopyNode(cur, 1);
+ 
+ 		if (cur_copy == NULL)
+ 			xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
+ 		could not serialize xml);
+ 
  		PG_TRY();
  		{
! 			xmlNodeDump(buf, NULL, cur_copy, 0, 1);
  			result = xmlBuffer_to_xmltype(buf);
  		}
  		PG_CATCH();
  		{
+ 			xmlFreeNode(cur_copy);
  			xmlBufferFree(buf);
  			PG_RE_THROW();
  		}
  		PG_END_TRY();
+ 		xmlFreeNode(cur_copy);
  		xmlBufferFree(buf);
  	}
  	else
***
*** 3656,3662  xml_xmlnodetoxmltype(xmlNodePtr cur)
   */
  static int
  xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj,
! 	   ArrayBuildState **astate)
  {
  	int			result = 0;
  	Datum		datum;
--- 3672,3679 
   */
  static int
  xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj,
! 	   ArrayBuildState **astate,
! 	   PgXmlErrorContext *xmlerrcxt)
  {
  	int			result = 0;
  	Datum		datum;
***
*** 3678,3684  xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj,
  
  	for (i = 0; i  result; i++)
  	{
! 		datum = PointerGetDatum(xml_xmlnodetoxmltype(xpathobj-nodesetval-nodeTab[i]));
  		*astate = accumArrayResult(*astate, datum,
     false, XMLOID,
     CurrentMemoryContext);
--- 3695,3702 
  
  	for (i = 0; i  result; i++)
  	{
! 		datum = PointerGetDatum(xml_xmlnodetoxmltype(xpathobj-nodesetval-nodeTab[i],
! 	 xmlerrcxt));
  		*astate = accumArrayResult(*astate, datum,
     false, XMLOID,
     CurrentMemoryContext);
***
*** 3882,3890  xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
  		 * Extract the results as requested.
  		 */
  		if (res_nitems != NULL)
! 			*res_nitems = xml_xpathobjtoxmlarray(xpathobj, astate);
  		else
! 			(void) xml_xpathobjtoxmlarray(xpathobj, astate);
  	}
  	PG_CATCH();
  	{
--- 3900,3908 
  		 * Extract the results as requested.
  		 */
  		if (res_nitems != NULL)
! 			*res_nitems = 

Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-07-08 Thread Ali Akbar
 I don't know enough about XML/XPATH to know if this is a good idea or not,


Actually currently because of the namespace problem, xpath() returns wrong
result (even worse: invalid xml!). So this patch corrects the behavior of
xpath() to the correct one.


 but I did go read the documentation of xmlCopyNode(), and I notice it says
 the second argument is

 extended: if 1 do a recursive copy (properties, namespaces and children
 when applicable) if 2 copy properties and namespaces (when applicable)

 Do we really want it to copy children?  Perhaps the value should be 2?
 (I have no idea, I'm just raising the question.)


If we put 1 as the parameter, the resulting Node will link to the existing
children. In this case, the namespace problem for the children might be
still exists. If any of the children uses different namespace(s) than the
parent, the namespace definition will not be copied correctly.

Just to be safe, in the patch 1 passed as the second argument.

Ideally, we can traverse the Node object and generate XML accordingly, but
it is a lot of work, and a lot of duplicating libxml's code. Maybe we
should push this upstream to libxml?

I also notice that it says

 Returns: a new #xmlNodePtr, or NULL in case of error.

 and the patch is failing to cover the error case.  Most likely, that
 would represent out-of-memory, so it definitely seems to be something
 we should account for.


Ah, overlooked that.

Skimming through libxml2's source, it looks like there are some other cases
beside out-of-memory. Will update the patch according to these cases.

Thanks for the review.

-- 
Ali Akbar


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-07-07 Thread Tom Lane
Abhijit Menon-Sen a...@2ndquadrant.com writes:
 I can confirm that it works fine. I have attached here a very slightly
 tweaked version of the patch (removed trailing whitespace, and changed
 some comment text).

 I'm marking this ready for committer.

I don't know enough about XML/XPATH to know if this is a good idea or not,
but I did go read the documentation of xmlCopyNode(), and I notice it says
the second argument is

extended: if 1 do a recursive copy (properties, namespaces and children
when applicable) if 2 copy properties and namespaces (when applicable)

Do we really want it to copy children?  Perhaps the value should be 2?
(I have no idea, I'm just raising the question.)

I also notice that it says

Returns: a new #xmlNodePtr, or NULL in case of error.

and the patch is failing to cover the error case.  Most likely, that
would represent out-of-memory, so it definitely seems to be something
we should account for.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-06-22 Thread Ali Akbar
2014-06-16 22:52 GMT+07:00 Abhijit Menon-Sen a...@2ndquadrant.com:

 Thanks for the patch, and welcome to Postgres development.



 I can confirm that it works fine. I have attached here a very slightly
 tweaked version of the patch (removed trailing whitespace, and changed
 some comment text).

 I'm marking this ready for committer.


Thanks for the review. Hope i will be able to contribute a little here and
there in the future.

-- 
Ali Akbar