Re: [HACKERS] Native XML

2011-03-10 Thread Robert Haas
On Wed, Mar 9, 2011 at 7:03 PM, Josh Berkus wrote: > Then I think the answer is that we need both data types.  One for > text-XML and one for binary-XML. That's what I think, too. I'm not sure whether we want both of them in core, but I think the binary-XML one would, at a minimum, make an awful

Re: [HACKERS] Native XML

2011-03-09 Thread Josh Berkus
On 3/9/11 10:11 AM, Bruce Momjian wrote: > If you are storing xml in an xml column just to get it >> validated, and doing no processing in the DB, then you'd probably >> prefer our current representation. If you want to build functional >> indexes on xpath expressions, and then run queries that ex

Re: [HACKERS] Native XML

2011-03-09 Thread Anton
On 03/09/2011 08:21 PM, Yeb Havinga wrote: > On 2011-03-09 19:30, Robert Haas wrote: >> On Wed, Mar 9, 2011 at 1:11 PM, Bruce Momjian wrote: >> >>> Robert Haas wrote: >>> On Mon, Feb 28, 2011 at 10:30 AM, Tom Lane wrote: > Well, in principle we could allow them

Re: [HACKERS] Native XML

2011-03-09 Thread Yeb Havinga
On 2011-03-09 19:30, Robert Haas wrote: On Wed, Mar 9, 2011 at 1:11 PM, Bruce Momjian wrote: Robert Haas wrote: On Mon, Feb 28, 2011 at 10:30 AM, Tom Lane wrote: Well, in principle we could allow them to work on both, just the same way that (for instance) "+" is a standardized operator but w

Re: [HACKERS] Native XML

2011-03-09 Thread Robert Haas
On Wed, Mar 9, 2011 at 1:11 PM, Bruce Momjian wrote: > Robert Haas wrote: >> On Mon, Feb 28, 2011 at 10:30 AM, Tom Lane wrote: >> > Well, in principle we could allow them to work on both, just the same >> > way that (for instance) "+" is a standardized operator but works on more >> > than one dat

Re: [HACKERS] Native XML

2011-03-09 Thread Bruce Momjian
Robert Haas wrote: > On Mon, Feb 28, 2011 at 10:30 AM, Tom Lane wrote: > > Well, in principle we could allow them to work on both, just the same > > way that (for instance) "+" is a standardized operator but works on more > > than one datatype. But I agree that the prospect of two parallel types

Re: [HACKERS] Native XML

2011-03-02 Thread Nicolas Barbier
2011/3/1 Andrew Dunstan : > I think hierarchical data really only scratches the surface of the problem. > It would be nice to be able to specify all sorts of context for searches: > >   * foo after bar >   * foo near bar >   * foo and bar in the same paragraph >   * foo as a parent/child/ancestor/

Re: [HACKERS] Native XML

2011-03-01 Thread Andrew Dunstan
On 03/01/2011 02:15 PM, Kevin Grittner wrote: Given that there were similar issues for other hierarchical data types, perhaps we need something similar to tsvector, but for hierarchical data. The extra layer of abstraction might not cost much when used for XML compared to the possible benefi

Re: [HACKERS] Native XML

2011-03-01 Thread Tom Lane
"Kevin Grittner" writes: > I apparently didn't express myself very well, since you seem to have > *completely* missed my point. I know we can do tsearch2 searches > against XML, or JSON, or YAML, or (insert next week's new favorite > format here). What we can't currently do efficiently is search

Re: [HACKERS] Native XML

2011-03-01 Thread Kevin Grittner
Andrew Dunstan wrote: > On 02/28/2011 05:28 PM, Kevin Grittner wrote: >> Anton wrote: >> >>> it was actually the focal point of my considerations: whether to >>> store plain text or 'something else'. > > There seems to be an almost universal assumption that storing XML > in its native form (i.e.

Re: [HACKERS] Native XML

2011-03-01 Thread Andrew Dunstan
On 03/01/2011 08:16 AM, Robert Haas wrote: On Mon, Feb 28, 2011 at 6:54 PM, Andrew Dunstan wrote: There seems to be an almost universal assumption that storing XML in its native form (i.e. a text stream) is going to produce inefficient results. Maybe it will, but I think it needs to be fairly

Re: [HACKERS] Native XML

2011-03-01 Thread Robert Haas
On Mon, Feb 28, 2011 at 6:54 PM, Andrew Dunstan wrote: > There seems to be an almost universal assumption that storing XML in its > native form (i.e. a text stream) is going to produce inefficient results. > Maybe it will, but I think it needs to be fairly convincingly demonstrated. > And then we

Re: [HACKERS] Native XML

2011-02-28 Thread Andrew Dunstan
On 02/28/2011 05:28 PM, Kevin Grittner wrote: Anton wrote: it was actually the focal point of my considerations: whether to store plain text or 'something else'. There seems to be an almost universal assumption that storing XML in its native form (i.e. a text stream) is going to produc

Re: [HACKERS] Native XML

2011-02-28 Thread Kevin Grittner
Anton wrote: > it was actually the focal point of my considerations: whether to > store plain text or 'something else'. Given that there were similar issues for other hierarchical data types, perhaps we need something similar to tsvector, but for hierarchical data. The extra layer of abstract

Re: [HACKERS] Native XML

2011-02-28 Thread Anton
On 02/28/2011 05:23 PM, Robert Haas wrote: > On Mon, Feb 28, 2011 at 10:30 AM, Tom Lane wrote: > >> Well, in principle we could allow them to work on both, just the same >> way that (for instance) "+" is a standardized operator but works on more >> than one datatype. But I agree that the prosp

Re: [HACKERS] Native XML

2011-02-28 Thread Andrew Dunstan
On 02/28/2011 10:51 AM, Tom Lane wrote: Andrew Dunstan writes: xpath_table is severely broken by design IMNSHO. We need a new design, but I'm reluctant to work on that until someone does LATERAL, because a replacement would be much nicer to design with it than without it. Well, maybe I'm mis

Re: [HACKERS] Native XML

2011-02-28 Thread Robert Haas
On Mon, Feb 28, 2011 at 10:30 AM, Tom Lane wrote: > Well, in principle we could allow them to work on both, just the same > way that (for instance) "+" is a standardized operator but works on more > than one datatype.  But I agree that the prospect of two parallel types > with essentially duplicat

Re: [HACKERS] Native XML

2011-02-28 Thread Tom Lane
Andrew Dunstan writes: > xpath_table is severely broken by design IMNSHO. We need a new design, > but I'm reluctant to work on that until someone does LATERAL, because a > replacement would be much nicer to design with it than without it. Well, maybe I'm missing something, but I don't really un

Re: [HACKERS] Native XML

2011-02-28 Thread Andrew Dunstan
On 02/28/2011 10:30 AM, Tom Lane wrote: The single most pressing problem we've got with XML right now is the poor state of the XPath extensions in contrib/xml2. If we don't see a meaningful step forward in that area, a new implementation of the xml datatype isn't likely to win acceptance.

Re: [HACKERS] Native XML

2011-02-28 Thread Robert Haas
On Sun, Feb 27, 2011 at 10:20 PM, Andrew Dunstan wrote: > No, I think the xpath implementation is from libxml2. But in any case, I > think the problem is in the whole design of the xpath_table function, and > not in the library used for running the xpath queries. i.e. it's our fault, > and not the

Re: [HACKERS] Native XML

2011-02-28 Thread Tom Lane
Andrew Dunstan writes: > On 02/28/2011 04:25 AM, Anton wrote: >> A question is of course, if potential new implementation must >> necessarily replace the existing one, immediately or at all. What I >> published is implemented as a new data type and thus pg_type.h and >> pg_proc.h are the only file

Re: [HACKERS] Native XML

2011-02-28 Thread Andrew Dunstan
On 02/28/2011 04:25 AM, Anton wrote: On 02/27/2011 11:57 PM, Peter Eisentraut wrote: On sön, 2011-02-27 at 10:45 -0500, Tom Lane wrote: Hmm, so this doesn't rely on libxml2 at all? Given the amount of pain that library has caused us, getting out from under it seems like a mighty attractive

Re: [HACKERS] Native XML

2011-02-28 Thread Anton
On 02/27/2011 11:57 PM, Peter Eisentraut wrote: > On sön, 2011-02-27 at 10:45 -0500, Tom Lane wrote: > >> Hmm, so this doesn't rely on libxml2 at all? Given the amount of pain >> that library has caused us, getting out from under it seems like a >> mighty attractive idea. >> > This doesn't

Re: [HACKERS] Native XML

2011-02-27 Thread Andrew Dunstan
On 02/27/2011 10:07 PM, Tom Lane wrote: Andrew Dunstan writes: On 02/27/2011 03:06 PM, Tom Lane wrote: The case that I don't think we have any idea how to solve is http://archives.postgresql.org/pgsql-hackers/2010-02/msg02424.php I'd forgotten about this. But as ugly as it is, I don't think

Re: [HACKERS] Native XML

2011-02-27 Thread Tom Lane
Andrew Dunstan writes: > On 02/27/2011 03:06 PM, Tom Lane wrote: >> The case that I don't think we have any idea how to solve is >> http://archives.postgresql.org/pgsql-hackers/2010-02/msg02424.php > I'd forgotten about this. But as ugly as it is, I don't think it's > libxml2's fault. Well, str

Re: [HACKERS] Native XML

2011-02-27 Thread Andrew Dunstan
On 02/27/2011 03:06 PM, Tom Lane wrote: Mike Fowler writes: I don't believe that XPath is "fundamentally broken", but I think Tom may have meant xslt. When reviewing a recent patch to xml2/xslt I found a few bugs in the way we were using libxslt, as well as a bug in the library itself (see http:

Re: [HACKERS] Native XML

2011-02-27 Thread Peter Eisentraut
On sön, 2011-02-27 at 10:45 -0500, Tom Lane wrote: > Hmm, so this doesn't rely on libxml2 at all? Given the amount of pain > that library has caused us, getting out from under it seems like a > mighty attractive idea. This doesn't replace the existing xml functionality, so it won't help getting r

Fwd: Re: [HACKERS] Native XML

2011-02-27 Thread Anton
Sorry for resending, I forgot to add 'pgsql-hackers' to CC. Original Message Subject: Re: [HACKERS] Native XML Date: Sun, 27 Feb 2011 23:18:03 +0100 From: Anton To: Tom Lane On 02/27/2011 04:45 PM, Tom Lane wrote: > Anton writes: > >

Re: [HACKERS] Native XML

2011-02-27 Thread Tom Lane
Mike Fowler writes: > I don't believe that XPath is "fundamentally broken", but I think Tom > may have meant xslt. When reviewing a recent patch to xml2/xslt I found > a few bugs in the way we were using libxslt, as well as a bug in the > library itself (see > http://archives.postgresql.org/pgsql

Re: [HACKERS] Native XML

2011-02-27 Thread David E. Wheeler
On Feb 27, 2011, at 11:43 AM, Tom Lane wrote: >> XPath is broken? I use it heavily in the Perl module Test::XPath and now, in >> PostgreSQL, with my explanation extension. > > Well, if you're only using cases that work, you don't need to worry. Okay then. David

Re: [HACKERS] Native XML

2011-02-27 Thread Mike Fowler
On 27/02/11 19:37, David E. Wheeler wrote: On Feb 27, 2011, at 11:23 AM, Tom Lane wrote: Well, that's why I asked --- if it's going to be a huge chunk of code, then I agree this is the wrong path to pursue. However, I do feel that libxml pretty well sucks, so if we could replace it with a rela

Re: [HACKERS] Native XML

2011-02-27 Thread Tom Lane
"David E. Wheeler" writes: > On Feb 27, 2011, at 11:23 AM, Tom Lane wrote: >> No, because the xpath stuff is fundamentally broken, and nobody seems to >> know how to make libxslt do what we actually need. See the open bugs >> on the TODO list. > XPath is broken? I use it heavily in the Perl modu

Re: [HACKERS] Native XML

2011-02-27 Thread David E. Wheeler
On Feb 27, 2011, at 11:23 AM, Tom Lane wrote: > Well, that's why I asked --- if it's going to be a huge chunk of code, > then I agree this is the wrong path to pursue. However, I do feel that > libxml pretty well sucks, so if we could replace it with a relatively > small amount of code, that migh

Re: [HACKERS] Native XML

2011-02-27 Thread Tom Lane
Andrew Dunstan writes: > On 02/27/2011 10:45 AM, Tom Lane wrote: >> Hmm, so this doesn't rely on libxml2 at all? Given the amount of pain >> that library has caused us, getting out from under it seems like a >> mighty attractive idea. How big a chunk of code do you think it'd be >> by the time y

Re: [HACKERS] Native XML

2011-02-27 Thread Andrew Dunstan
On 02/27/2011 10:45 AM, Tom Lane wrote: Anton writes: I've been playing with 'native XML' for a while and now wondering if further development of such a feature makes sense for Postgres. ... Unlike 'libxml2', the parser uses palloc()/pfree(). The output format is independent from any 3rd part

Re: [HACKERS] Native XML

2011-02-27 Thread Tom Lane
Anton writes: > I've been playing with 'native XML' for a while and now wondering if > further development of such a feature makes sense for Postgres. > ... > Unlike 'libxml2', the parser uses palloc()/pfree(). The output format is > independent from any 3rd party code. Hmm, so this doesn't rely

Re: [HACKERS] Native XML

2011-02-26 Thread Josh Berkus
On 2/26/11 3:40 PM, Anton wrote: > I've been playing with 'native XML' for a while and now wondering if > further development of such a feature makes sense for Postgres. > (By not having brought this up earlier I'm taking the chance that the > effort will be wasted, but that's not something you sho

[HACKERS] Native XML

2011-02-26 Thread Anton
Hello, I've been playing with 'native XML' for a while and now wondering if further development of such a feature makes sense for Postgres. (By not having brought this up earlier I'm taking the chance that the effort will be wasted, but that's not something you should worry about.) The code is ava