hello,

this note is intended to describe my work on beginning to further
integrate xml into postgresql.  first, i'd like to thank the
contributers of contrib/xml as their work was instrumental in helping
me understand what support exists and where i wanted to go. thanks.

my first requirement is to export data from the database into a format
which can be read not only by existing (postgresql) clients but by
people and programs that don't know how to use a PGresult. xml is very
verbose, but its popularity makes it closer to universal than anything
else of which i could think. in addition, ideas like XSL/XSLT make an
export of xml very attractive to me.

it's been said that converting a PGresult into xml is "trivial" and
that's why it hasn't been done in the codebase as of yet. i have seen
much code that writes xml, and many mistakes are made. most often
improper escaping, or writing to a schema/DTD that has not been
well-thought out. the transformation into xml is not difficult, but it
does require attention to detail.

i feel badly that i have not been able to use any existing
standards. xmlrpc, i found, was not type-rich enough, and that made it
difficult or impossible to use. in particular, the only way to
represent a matrix is as a struct of structs. this makes it very
verbose for one to encode a PGresult. i found SOAP too difficult for
compliance. so my result was to create a schema, which results in a  DTD.

an example of what my code generates can be found below. the following
xml is the result of the query "SELECT 1 as foo 2 as bar":

<?xml version='1.0' encoding='ISO-8859-1'?>
<!DOCTYPE PGresult [
  <!ELEMENT PGresult (col_desc*, row*)>
  <!ATTLIST PGresult
     num_rows CDATA #REQUIRED
     num_cols CDATA #REQUIRED>
  <!ELEMENT col_desc EMPTY>
  <!ATTLIST col_desc 
     num CDATA #REQUIRED 
     format (text | binary) #REQUIRED 
     type CDATA #REQUIRED 
     name CDATA #REQUIRED>
  <!ELEMENT row (col*)>
  <!ATTLIST row
     num CDATA #REQUIRED>
  <!ELEMENT col (#PCDATA)>
  <!ATTLIST col
     num CDATA #REQUIRED>
  <!ENTITY NULL ''>
]>
<PGresult num_rows='1' num_cols='2'>
  <col_desc num='0' type='int4' format='text' name='foo' />
  <col_desc num='1' type='int4' format='text' name='bar' />
  <row num='0'>
    <col num='0'>1</col>
    <col num='1'>2</col>
  </row>
</PGresult>

a slightly more complicated example:
template1=# select oid,typname,typlen,typtype from pg_type where 
oid<20;
 oid | typname | typlen | typtype
-----+---------+--------+---------
  16 | bool    |      1 | b
  17 | bytea   |     -1 | b
  18 | char    |      1 | b
  19 | name    |     32 | b
(4 rows)

<!DOCTYPE PGresult [
  <!ELEMENT PGresult (col_desc*, row*)>
  <!ATTLIST PGresult
     num_rows CDATA #REQUIRED
     num_cols CDATA #REQUIRED>
  <!ELEMENT col_desc EMPTY>
  <!ATTLIST col_desc
     num CDATA #REQUIRED
     format (text | binary) #REQUIRED
     type CDATA #REQUIRED
     name CDATA #REQUIRED>
  <!ELEMENT row (col*)>
  <!ATTLIST row
     num CDATA #REQUIRED>
  <!ELEMENT col (#PCDATA)>
  <!ATTLIST col
     num CDATA #REQUIRED>
  <!ENTITY NULL ''>
]>
<PGresult num_rows='4' num_cols='4'>
  <col_desc num='0' type='oid' format='text' name='oid' />
  <col_desc num='1' type='name' format='text' name='typname' />
  <col_desc num='2' type='int2' format='text' name='typlen' />
  <col_desc num='3' type='char' format='text' name='typtype' />
  <row num='0'>
    <col num='0'>16</col>
    <col num='1'>bool</col>
    <col num='2'>1</col>
    <col num='3'>b</col>
  </row>
  <row num='1'>
    <col num='0'>17</col>
    <col num='1'>bytea</col>
    <col num='2'>-1</col>
    <col num='3'>b</col>
  </row>
  <row num='2'>
    <col num='0'>18</col>
    <col num='1'>char</col>
    <col num='2'>1</col>
    <col num='3'>b</col>
  </row>
  <row num='3'>
    <col num='0'>19</col>
    <col num='1'>name</col>
    <col num='2'>32</col>
    <col num='3'>b</col>
  </row>
</PGresult>

i have done this work for myself and my own needs, so i fully
understand if this work is not interesting to the postgresql group in
general. however, if there is some chance that the changes could be
incorporated into the tree, i would be interested in contributing, as
integration into a proper version of postgresql will make my build
easier. ;)

i would expect that integration would look something like exposing
from libpq a function that looks something like:
  const char *PGresult_as_xml(PGresult *result, int include_dtd);

i would also expect that psql would be modified to take a \X
and to call the above function. there is some strangeness now,
as psql doesn't call methods defined in libpq to print; it has
its own printer. i, of course, would do this work. :) i just
need to know that people are interested.

also, if integration is going to happen, i will need to replace
calls to my hashtables with calls to postgresql's hashtables.
i saw dynamic hashtables in the backend, but not in the interfaces.
i wasn't exactly sure how i should go about introducing another
module to the frontend; there could be problems of which i
remain blissfully unaware.

i look forward to feedback, and i hope this note finds you well,

b

__________________________________
Do you Yahoo!?
Yahoo! SiteBuilder - Free web site building tool. Try it!
http://webhosting.yahoo.com/ps/sb/

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Reply via email to