On 03/01/2011 08:16 AM, Robert Haas wrote:
> On Mon, Feb 28, 2011 at 6:54 PM, Andrew Dunstan <and...@dunslane.net> wrote:
>> There seems to be an almost universal assumption that storing XML in its
>> native form (i.e. a text stream) is going to produce inefficient results.
>> Maybe it will, but I think it needs to be fairly convincingly demonstrated.
>> And then we would have to consider the costs. For example, unless we
>> implemented our own XPath processor to work with our own XML format (do we
>> really want to do that?), to evaluate an XPath expression for a piece of XML
>> we'd actually need to produce the text format from our internal format
>> before passing it to some external library to parse into its internal format
>> and then process the XPath expression. That means we'd actually be making
>> things worse, not better. But this is clearly the sort of processing people
>> want to do - see today's discussion upthread about xpath_table.
> Well, obviously the only point of having our own internal format is if
> we have our own xpath processor &c to match.  One would think that
> this would be a lot faster than parsing the string with libxml2 every
> time we want to xpath it, especially for large documents.  But then
> again, I haven't seen any benchmarks.
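To make the parsing cost concrete, the round trip we're talking about looks more or less like this when driven through libxml2 directly. This is only a rough sketch, not PostgreSQL's actual xpath() code; the function name is invented and error handling is mostly omitted:

#include <libxml/parser.h>
#include <libxml/xpath.h>

/*
 * Count the nodes matching an XPath expression in an XML value stored as
 * text.  Step 1, the full parse, is paid on every single call; nothing is
 * cached between calls.
 */
static int
count_xpath_matches(const char *xmltext, int len, const char *expr)
{
    xmlDocPtr           doc;
    xmlXPathContextPtr  ctxt;
    xmlXPathObjectPtr   res;
    int                 n;

    /* 1. Re-parse the text form into libxml2's internal tree. */
    doc = xmlReadMemory(xmltext, len, "noname.xml", NULL, 0);
    if (doc == NULL)
        return -1;

    /* 2. Evaluate the XPath expression against the freshly built tree. */
    ctxt = xmlXPathNewContext(doc);
    res = xmlXPathEvalExpression((const xmlChar *) expr, ctxt);
    n = (res && res->nodesetval) ? res->nodesetval->nodeNr : 0;

    /* 3. Throw the whole tree away again. */
    xmlXPathFreeObject(res);
    xmlXPathFreeContext(ctxt);
    xmlFreeDoc(doc);
    return n;
}

An internal pre-parsed format would only save step 1; steps 2 and 3 stay the same unless we also write our own XPath engine.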


A processor of our own would be a huge body of code to maintain, complex and full of subtleties which, unless we were deeply invested in the XML standards, would bite us, I have no doubt.

Now, if someone wanted to start a project that added efficient serialization/deserialization of libxml2 (or another library's) objects, so that we could avoid the constant re-parsing overhead, that would make a lot more sense to me.
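To illustrate the idea, the serialization half might look vaguely like this. The byte format, the helper names and the Blob type are all invented for the sake of the sketch; namespaces, comments, CDATA, DTDs, versioning and the question of how this maps onto our datums are all ignored:

#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <libxml/parser.h>
#include <libxml/tree.h>

/* Trivially growable byte buffer -- a stand-in for whatever we'd really use. */
typedef struct { char *buf; size_t len, cap; } Blob;

static void
blob_put(Blob *b, const void *data, size_t n)
{
    if (b->len + n > b->cap)
    {
        b->cap = (b->len + n) * 2;
        b->buf = realloc(b->buf, b->cap);
    }
    memcpy(b->buf + b->len, data, n);
    b->len += n;
}

static void
blob_put_string(Blob *b, const xmlChar *s)
{
    uint32_t n = s ? (uint32_t) strlen((const char *) s) : 0;

    blob_put(b, &n, sizeof(n));         /* length-prefixed string */
    if (n > 0)
        blob_put(b, s, n);
}

/*
 * Depth-first flattening of a sibling list.  Elements are tagged 1 and carry
 * their name, their (name, value) attribute pairs ended by an empty name, and
 * their flattened children; text nodes are tagged 2 and carry their content;
 * a 0 byte closes each sibling list.  Comments, PIs, CDATA etc. are simply
 * skipped in this sketch.
 */
static void
flatten_node_list(Blob *b, xmlNodePtr node)
{
    uint8_t end = 0;

    for (; node != NULL; node = node->next)
    {
        if (node->type == XML_ELEMENT_NODE)
        {
            uint8_t tag = 1;

            blob_put(b, &tag, 1);
            blob_put_string(b, node->name);
            for (xmlAttrPtr attr = node->properties; attr; attr = attr->next)
            {
                xmlChar *val = xmlGetProp(node, attr->name);

                blob_put_string(b, attr->name);
                blob_put_string(b, val);
                xmlFree(val);
            }
            blob_put_string(b, (const xmlChar *) "");
            flatten_node_list(b, node->children);
        }
        else if (node->type == XML_TEXT_NODE)
        {
            uint8_t tag = 2;

            blob_put(b, &tag, 1);
            blob_put_string(b, node->content);
        }
    }
    blob_put(b, &end, 1);
}

/* Flatten an already-parsed document, starting at its root element. */
static Blob
flatten_doc(xmlDocPtr doc)
{
    Blob b = {NULL, 0, 0};

    flatten_node_list(&b, xmlDocGetRootElement(doc));
    return b;
}

Loading would be the mirror image, rebuilding the xmlNode tree without ever invoking the parser. Whether that actually beats libxml2's parser for typical documents is exactly the kind of thing that would need benchmarking.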

cheers

andrew


