Re: [HACKERS] [GSOC] questions about idea rewrite pg_dump as library

2013-04-20 Thread Joel Jacobson
On Fri, Apr 12, 2013 at 1:07 PM, Hannu Krosing ha...@2ndquadrant.comwrote:

 I was just thinking of moving the queries the pg_dump currently
 uses into UDF-s, which do _not_ use catalog cache, but will use
 the same SQL to query catalogs as pg_dump currently does
 using whatever snapshot mode is currently set .

 the pg_dump will need to still have the same queries for older
 versions of postgresql but for new versions pg_dump  can become
 catalog-agnostic.

 and I think that we can retire pg_dump support for older
 postgresql versions the same way we drop support for
 older versions of postgresql itself.



 main things I see would be

  * get_list_of_objects(object_**type, pattern or namelist)
  * get_sql_def_for_object(object_**type, object_name)
  * sort_by_dependency(list of [obj_type, obj_name])

 from this you could easily construct most uses, especially if
 sort_by_dependency(list of [obj_type, obj_name])
 would be smart enough to break circular dependencies, like
 turning to tables with mutual FK-s into tabledefs without
 FKs + separate constraints.


+1

This is an excellent idea. This would allow doing all kinds of crazy things
outside of the scope of pg_dump.

2 years ago I was working on a system to version control the schema, inside
the database.
Don't know if it's a good idea or not, but one thing which bugged me a lot
was the lack of pg_get_[object type]def(oid) functions for all different
object types.
It also turned out to be quite complicated to do the pg_depend topological
sort yourself. I managed eventually, but it was running to slow because I
had to pass the entire content of pg_depend to a plperl function I wrote.

With this in place I would be motivated enough to resume my old project,
which is still online at https://github.com/gluefinance/pov if anyone is
interested.

Is it really necessary to write all the missing pg_get_[object
type]def(oid) functions in C? I think it would be quite easy to put them
together using pure SQL, you wouldn't even need PL/pgSQL.

This old view I once wrote manage to produce working create and drop
statements for most object types using SQL only:
https://github.com/gluefinance/pov/blob/master/sql/schema/pov/views/pg_depend_definitions.sql

It would also be nice with functions which returned the proper command to
DROP an object. I need it in this project in order to do schema
modifications where objects have to be dropped/recreated in a particular
order to not break dependencies. Perhaps there are other use cases out
there.


Re: [HACKERS] [GSOC] questions about idea rewrite pg_dump as library

2013-04-12 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
 This idea doesn't work because of back-patch considerations (i.e. we
 would not be able to create the functions in back branches, and so this
 new style of pg_dump would only work with future server versions).  So

That is a policy question, not a technical one.

We could either add the new functions in the backend binary itself, or
provide it as an extension that pg_dump would know to install when
needed, if we decided it's ok.

My understanding is that will need to change that policy anyways the day
we have a page disk format change and pg_upgrade needs to flag the old
cluster pages with the old page version number before being able to run,
or something.

 There are other issues too, in particular that most of the backend's
 code tends to work on SnapshotNow time whereas pg_dump would really
 prefer it was all done according to the transaction snapshot.  We have

Would that be solved by having MVCC catalogs, or the backend code you're
talking about wouldn't be included in there? (which would be surprising
to me, as much as trumping the benefits of MVCC catalogs, but well).

 In any case, push it to the backend offers no detectable help with the
 core design issue here, which is figuring out what functionality needs
 to be exposed with what API.

Andrew did begin to work on that parts with the Retail DDL project. We
know of several getddl implementation, and you can also have a look at
the pg_dump -Fs (split format) patch that didn't make it for 9.3, where
some API work has been done.

The need exists and some thinking over the API to get here did happen.
Some more certainly needs to be done, granted.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GSOC] questions about idea rewrite pg_dump as library

2013-04-12 Thread Dimitri Fontaine
Michael Paquier michael.paqu...@gmail.com writes:
 I recall discussions about reverse engineering of a parsed query tree in
 the event trigger threads but nothing has been committed I think. Also, you

Yes. The name used in there was Normalized Command String.

 need to consider that implementing such reverse engineering mechanism in
 core might not be a good thing for new features and maintenance, as it
 would mean that it is necessary to change those APIs consistently with what
 is added on the parsing side.

The goal is to retain the same API, which is quite simple:

  function get_command_string(Node *parsetree) returns text

At the SQL level, the Node * is of datatype internal and you can't
forge it, you need to be given it in some ways. In the Event Trigger
case we though of a TG_PARSETREE magic variable, or maybe another
function get_current_parsetree() that only work when called from an
event trigger.

The other part of the API of course is how to represent the data, and as
we're talking about a Normalized Command String, there's no choice but
issuing a valid SQL command string that the server would know how to
execute and which would have the same side effects.

So of course a 9.3 and a 9.4 server equiped with that hypothetical
function would behave differently when the syntax did change. And that's
exactly why I think it the best choice here is to have that code
embedded and maintained in core.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GSOC] questions about idea rewrite pg_dump as library

2013-04-12 Thread Shuai Fan

On 04/11/2013 11:48 PM, Andrew Dunstan wrote:
It could be interesting to have a library that would output database 
metadata in some machine readable and manipulatable format such as 
JSON or XML. One thing that's annoying about the text output pg_dump 
produces is that it's not at all structured, so if you want, for 
example, to restore a table but to a table of a different name, or to 
a different schema, then you're reduced to having to mangle the SQL by 
using hand editing or regular expression matching. Something with the 
potential to ease that pain would be worth having.


Yes. This is really interesting. Current code in pg_dump, supports 4 
ArchiveFormat, e.g. archCustom, archTar, archNull and archDirectory. 
These formats are implementation of interface pg_backup. Maybe I could 
try to add two implementation of XML and JSON.
It is worth to mention that I wrote a program to parse XML format 
file into csv one using library libxercesc a month ago, Although, this 
program is just like helloworld. But, maybe I could get benefit from 
that small program, because both of them use XML format. And what I need 
to do is try another xml library.
I had a look at JSON on wiki. The format is a little like XML. Both 
of them are nested. And there are some library could be used, e.g. 
libjson (or json.c, or other json library writting in C) and libxml2 (or 
something else).


BTW, could it be an idea for GSOC? If so, I can have a try. Add XML 
and JSON output format for pg_dump.


Thank you all for your attention.


Best regards,
 Shuai





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GSOC] questions about idea rewrite pg_dump as library

2013-04-12 Thread Hannu Krosing

On 04/11/2013 12:17 AM, Tom Lane wrote:

Alvaro Herrera alvhe...@2ndquadrant.com writes:

Hannu Krosing wrote:

Natural solution to this seems to move most of pg_dump functionality
into backend as functions, so we have pg_dump_xxx() for everything
we want to dump plus a topological sort function for getting the
objects in right order.

This idea doesn't work because of back-patch considerations (i.e. we
would not be able to create the functions in back branches, and so this
new style of pg_dump would only work with future server versions).  So
pg_dump itself would have to retain capability to dump stuff from old
servers.  This seems unlikely to fly at all, because we'd be then
effectively maintaining pg_dump in two places, both backend and the
pg_dump source code.

There are other issues too, in particular that most of the backend's
code tends to work on SnapshotNow time whereas pg_dump would really
prefer it was all done according to the transaction snapshot.

I was just thinking of moving the queries the pg_dump currently
uses into UDF-s, which do _not_ use catalog cache, but will use
the same SQL to query catalogs as pg_dump currently does
using whatever snapshot mode is currently set .

the pg_dump will need to still have the same queries for older
versions of postgresql but for new versions pg_dump  can become
catalog-agnostic.

and I think that we can retire pg_dump support for older
postgresql versions the same way we drop support for
older versions of postgresql itself.

Hannu


We have
got bugs of that ilk already in pg_dump, but we shouldn't introduce a
bunch more.  Doing this right would therefore mean that we'd have to
write a lot of duplicative code in the backend, ie, it's not clear that
we gain any synergy by pushing the functionality over.  It might
simplify cross-backend-version issues (at least for backend versions
released after we'd rewritten all that code) but otherwise I'm afraid
it'd just be pushing the problems somewhere else.

In any case, push it to the backend offers no detectable help with the
core design issue here, which is figuring out what functionality needs
to be exposed with what API.

main things I see would be

 * get_list_of_objects(object_type, pattern or namelist)
 * get_sql_def_for_object(object_type, object_name)
 * sort_by_dependency(list of [obj_type, obj_name])

from this you could easily construct most uses, especially if
sort_by_dependency(list of [obj_type, obj_name])
would be smart enough to break circular dependencies, like
turning to tables with mutual FK-s into tabledefs without
FKs + separate constraints.

Or we could always have constraints separately, so that
the ones depending on non-exported objects would be easy
to leave out

My be the dependency API analysis itself is something
worth a GSOC effort ?

Hannu


regards, tom lane




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GSOC] questions about idea rewrite pg_dump as library

2013-04-11 Thread Tom Lane
Pavel Golub pa...@microolap.com writes:
 From my point of view the new library should export only two
 functions:

 1. The execution function:

 ExecStatusType PGdumpdbParams(const char * const *keywords,
  const char * const *values);

No, this is exactly *wrong*.  You might as well not bother to refactor,
if the only API the library presents is exactly equivalent to what you
could get with system(pg_dump ...).

I don't know what the right answer is, but this isn't it.  Most people
who are interested in this topic are interested because they want to get
output that is different from anything pg_dump would produce on its own,
for instance applying a more complex object-selection rule than anything
pg_dump offers.  Right now, the only way they can do that is lobby to
add new switch options to pg_dump.  With a change like this, it'd still
be the case that they can't get what they want except by adding new
switch options to pg_dump.  I don't see any advantage gained.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GSOC] questions about idea rewrite pg_dump as library

2013-04-11 Thread Pavel Golub
Hello, Tom.

You wrote:

TL Pavel Golub pa...@microolap.com writes:
 From my point of view the new library should export only two
 functions:

 1. The execution function:

 ExecStatusType PGdumpdbParams(const char * const *keywords,
  const char * const *values);

TL No, this is exactly *wrong*.  You might as well not bother to refactor,
TL if the only API the library presents is exactly equivalent to what you
TL could get with system(pg_dump ...).

Well, yes. You're absolutely right. But should this be a starting
point?

TL I don't know what the right answer is, but this isn't it.  Most people
TL who are interested in this topic are interested because they want to get
TL output that is different from anything pg_dump would produce on its own,
TL for instance applying a more complex object-selection rule than anything
TL pg_dump offers.  Right now, the only way they can do that is lobby to
TL add new switch options to pg_dump.  With a change like this, it'd still
TL be the case that they can't get what they want except by adding new
TL switch options to pg_dump.  I don't see any advantage gained.

TL regards, tom lane



-- 
With best wishes,
 Pavel  mailto:pa...@gf.microolap.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GSOC] questions about idea rewrite pg_dump as library

2013-04-11 Thread Andrew Dunstan


On 04/11/2013 09:51 AM, Tom Lane wrote:

Pavel Golub pa...@microolap.com writes:

 From my point of view the new library should export only two
functions:
1. The execution function:
ExecStatusType PGdumpdbParams(const char * const *keywords,
  const char * const *values);

No, this is exactly *wrong*.  You might as well not bother to refactor,
if the only API the library presents is exactly equivalent to what you
could get with system(pg_dump ...).

I don't know what the right answer is, but this isn't it.  Most people
who are interested in this topic are interested because they want to get
output that is different from anything pg_dump would produce on its own,
for instance applying a more complex object-selection rule than anything
pg_dump offers.  Right now, the only way they can do that is lobby to
add new switch options to pg_dump.  With a change like this, it'd still
be the case that they can't get what they want except by adding new
switch options to pg_dump.  I don't see any advantage gained.






Well, either they want that or they want that output more accessibly, 
and without all the baggage that pg_dump necessarily brings to the 
table. pg_dump does a lot of stuff that's basically designed for bulk 
operations, and often what people want is a way to get, say, the 
creation DDL for some object, without any locks than the usual locks any 
transaction takes. Last year I started writing a package to provide such 
functions, which i called RetailDDL, and it was well received at the 
conference where I talked about it, but I have not had time since then 
to work on it, as JSON development among other things has had a rather 
higher priority. But I think it's very well worth doing. I think in 
principle having database functions for the creation DDL for its own 
objects is a good idea.


And yes, that would mean keeping knowledge of how to produce such output 
in two places - pg_dump is going to need to keep historical knowledge, 
for one thing. But I can live with that.


It could be interesting to have a library that would output database 
metadata in some machine readable and manipulatable format such as JSON 
or XML. One thing that's annoying about the text output pg_dump produces 
is that it's not at all structured, so if you want, for example, to 
restore a table but to a table of a different name, or to a different 
schema, then you're reduced to having to mangle the SQL by using hand 
editing or regular expression matching. Something with the potential to 
ease that pain would be worth having.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GSOC] questions about idea rewrite pg_dump as library

2013-04-11 Thread Stephen Frost
* Andrew Dunstan (and...@dunslane.net) wrote:
 On 04/11/2013 09:51 AM, Tom Lane wrote:
 No, this is exactly *wrong*.  You might as well not bother to refactor,
 if the only API the library presents is exactly equivalent to what you
 could get with system(pg_dump ...).

Agreed.

 Well, either they want that or they want that output more
 accessibly, and without all the baggage that pg_dump necessarily
 brings to the table. pg_dump does a lot of stuff that's basically
 designed for bulk operations, and often what people want is a way to
 get, say, the creation DDL for some object, without any locks than
 the usual locks any transaction takes. 

Yes- being able to get that from a simple database function would be
very nice.  I wonder if some of what's been done with the event
triggers would inform us about what that API should look like.

 And yes, that would mean keeping knowledge of how to produce such
 output in two places - pg_dump is going to need to keep historical
 knowledge, for one thing. But I can live with that.

Agreed.  If it doesn't live in two places, for some period, we'll never
actually have it outside of pg_dump.

 It could be interesting to have a library that would output database
 metadata in some machine readable and manipulatable format such as
 JSON or XML.

The DB function to produce the DDL command might have options to produce
it in multiple formats..?  Seems like it'd be nice to have, though
perhaps not initially.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [GSOC] questions about idea rewrite pg_dump as library

2013-04-11 Thread Michael Paquier
On Fri, Apr 12, 2013 at 1:00 AM, Stephen Frost sfr...@snowman.net wrote:

  Well, either they want that or they want that output more
  accessibly, and without all the baggage that pg_dump necessarily
  brings to the table. pg_dump does a lot of stuff that's basically
  designed for bulk operations, and often what people want is a way to
  get, say, the creation DDL for some object, without any locks than
  the usual locks any transaction takes.

 Yes- being able to get that from a simple database function would be
 very nice.  I wonder if some of what's been done with the event
 triggers would inform us about what that API should look like.

I recall discussions about reverse engineering of a parsed query tree in
the event trigger threads but nothing has been committed I think. Also, you
need to consider that implementing such reverse engineering mechanism in
core might not be a good thing for new features and maintenance, as it
would mean that it is necessary to change those APIs consistently with what
is added on the parsing side.
It could make more sense to have such a set of functions created as a
separate project.

My 2c.
-- 
Michael


Re: [HACKERS] [GSOC] questions about idea rewrite pg_dump as library

2013-04-11 Thread Tatsuo Ishii
  Well, either they want that or they want that output more
  accessibly, and without all the baggage that pg_dump necessarily
  brings to the table. pg_dump does a lot of stuff that's basically
  designed for bulk operations, and often what people want is a way to
  get, say, the creation DDL for some object, without any locks than
  the usual locks any transaction takes.

 Yes- being able to get that from a simple database function would be
 very nice.  I wonder if some of what's been done with the event
 triggers would inform us about what that API should look like.

 I recall discussions about reverse engineering of a parsed query tree in
 the event trigger threads but nothing has been committed I think. Also, you
 need to consider that implementing such reverse engineering mechanism in
 core might not be a good thing for new features and maintenance, as it
 would mean that it is necessary to change those APIs consistently with what
 is added on the parsing side.
 It could make more sense to have such a set of functions created as a
 separate project.

This may or may not related to, but...

pgpool-II already does reverse engineering from a parsed query. It
parses a query, genetrates raw parse tree, rewrites it for certain
purpose and generates text query. If you are interested, you could
take a look at pgpool-II source code.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GSOC] questions about idea rewrite pg_dump as library

2013-04-10 Thread Peter Eisentraut
On 4/10/13 10:54 AM, ˧ wrote:
 I'm interested in the idea Rewrite (add) pg_dump and pg_restore
 utilities as libraries (.so, .dll  .dylib).

The pg_dump code is a giant mess, and refactoring it as a library is
perhaps not a project for a new hacker.

Independent of that, I think the first consideration in such a project
would be, who else would be using that library?  What are the use cases?
 And then come up with an interface around that, and then see about
refactoring pg_dump.

I think the main uses cases mentioned in connection with this idea are
usually in the direction of finer-grained control over what gets dumped
and how.  But making pg_dump into a library would not necessarily
address that.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GSOC] questions about idea rewrite pg_dump as library

2013-04-10 Thread Alvaro Herrera
Peter Eisentraut wrote:

 I think the main uses cases mentioned in connection with this idea are
 usually in the direction of finer-grained control over what gets dumped
 and how.  But making pg_dump into a library would not necessarily
 address that.

There's also the matter of embedding pg_dump into other programs.  For
example, calling the pg_dump executable from inside pgAdmin is a rather
messy solution to the problem.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GSOC] questions about idea rewrite pg_dump as library

2013-04-10 Thread Hannu Krosing

On 04/10/2013 11:02 PM, Alvaro Herrera wrote:

Peter Eisentraut wrote:


I think the main uses cases mentioned in connection with this idea are
usually in the direction of finer-grained control over what gets dumped
and how.  But making pg_dump into a library would not necessarily
address that.

There's also the matter of embedding pg_dump into other programs.  For
example, calling the pg_dump executable from inside pgAdmin is a rather
messy solution to the problem.


Natural solution to this seems to move most of pg_dump functionality
into backend as functions, so we have pg_dump_xxx() for everything
we want to dump plus a topological sort function for getting the
objects in right order.

The main things left into pg_dump the library would be support various ways
to format the dump results into text, tar and dump files.


Hannu


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GSOC] questions about idea rewrite pg_dump as library

2013-04-10 Thread Alvaro Herrera
Hannu Krosing wrote:
 On 04/10/2013 11:02 PM, Alvaro Herrera wrote:
 Peter Eisentraut wrote:
 
 I think the main uses cases mentioned in connection with this idea are
 usually in the direction of finer-grained control over what gets dumped
 and how.  But making pg_dump into a library would not necessarily
 address that.
 There's also the matter of embedding pg_dump into other programs.  For
 example, calling the pg_dump executable from inside pgAdmin is a rather
 messy solution to the problem.

 Natural solution to this seems to move most of pg_dump functionality
 into backend as functions, so we have pg_dump_xxx() for everything
 we want to dump plus a topological sort function for getting the
 objects in right order.

This idea doesn't work because of back-patch considerations (i.e. we
would not be able to create the functions in back branches, and so this
new style of pg_dump would only work with future server versions).  So
pg_dump itself would have to retain capability to dump stuff from old
servers.  This seems unlikely to fly at all, because we'd be then
effectively maintaining pg_dump in two places, both backend and the
pg_dump source code.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GSOC] questions about idea rewrite pg_dump as library

2013-04-10 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Hannu Krosing wrote:
 Natural solution to this seems to move most of pg_dump functionality
 into backend as functions, so we have pg_dump_xxx() for everything
 we want to dump plus a topological sort function for getting the
 objects in right order.

 This idea doesn't work because of back-patch considerations (i.e. we
 would not be able to create the functions in back branches, and so this
 new style of pg_dump would only work with future server versions).  So
 pg_dump itself would have to retain capability to dump stuff from old
 servers.  This seems unlikely to fly at all, because we'd be then
 effectively maintaining pg_dump in two places, both backend and the
 pg_dump source code.

There are other issues too, in particular that most of the backend's
code tends to work on SnapshotNow time whereas pg_dump would really
prefer it was all done according to the transaction snapshot.  We have
got bugs of that ilk already in pg_dump, but we shouldn't introduce a
bunch more.  Doing this right would therefore mean that we'd have to
write a lot of duplicative code in the backend, ie, it's not clear that
we gain any synergy by pushing the functionality over.  It might
simplify cross-backend-version issues (at least for backend versions
released after we'd rewritten all that code) but otherwise I'm afraid
it'd just be pushing the problems somewhere else.

In any case, push it to the backend offers no detectable help with the
core design issue here, which is figuring out what functionality needs
to be exposed with what API.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GSOC] questions about idea rewrite pg_dump as library

2013-04-10 Thread Pavel Golub
Hello, 帅.

You wrote:

帅 Hi all, 
帅 I'd like to introduce myself to the dev community. I am Shuai
帅 Fan, a student from Dalian University of Technology, DLUT , for
帅 short, China. And I am interested in working with PostgreSQL project in 
GSOC2013.
帅 I'm interested in the idea Rewrite (add) pg_dump and
帅 pg_restore utilities as libraries (.so, .dll  .dylib). 

帅 These days, I had a talk with Mr. Pavel Golub by email, the
帅 author of this post. And asked some questions about this idea. He
帅 adviced me to post the question to this mail list. 

帅 My question is:
帅 There are lots of functions in pg_dump.c. If I rewrite this
帅 file as library. I should split pg_dump.c into two or more
帅 files(Mr. Pavel Golub's advice). However, some functions, do have return 
value. e.g.

帅 static DumpableObject *createBoundaryObjects(void);

帅 I thought, these functions must get return value through
帅 function argument by passing pointer to it, when using library.
帅 But, the question is: If I did so, function prototype may be
帅 changed. And Mr. Pavel Golub said it's better to keep all
帅 functions with the same declaration, and so we will only have one
帅 code base for both console application and library. I think he is right.
帅 But then, how can I deal with this situation? From my point of
帅 view, I can't malloc a variable (e.g. DumpableObject) in one
帅 library (e.g. pg_dumplib.so), and then return it's pointer to
帅 another library (or excutable program). Maybe, it's not safe(?). Or
帅 has something to do with memory leak(?). I'm not sure.
帅 
帅 Do you have any suggestions?

From my point of view the new library should export only two
functions:

1. The execution function:

ExecStatusType PGdumpdbParams(const char * const *keywords,
 const char * const *values);

Return type may be other, but ExecStatusType seems to be OK for this
purpose: PGRES_TUPLES_OK - for success, PGRES_FATAL_ERROR - for fail.

Parameters will remain the same as usual command-line options for
pg_dump. Thus we will have less work for existing application, e.g.
pgAdmin.

2. Logging or notice processing function:

typedef void (*PGdumplogProcessor) (char *relname, int done, char *message);

PGdumplogProcessor PQsetNoticeProcessor(PGdumplogProcessor proc,
 void *arg);

The purpose of this function is process output of the dump. The first
argument is callback-function which accepts information about current
relname (or operation, or stage), done indicates how much work done
(for progress bars etc.), message contains some extra information.

That's all! Only two functions. Indeed we don't need all those
low-level dump functions like createBoundaryObjects etc. There will be
the only one entry to the whole logic. And if one wants the only one
table, funcction or schema - combination of correct parameters should
be passed to PGdumpdbParams.

帅 Best wishes,
帅 Shuai




-- 
With best wishes,
 Pavel  mailto:pa...@gf.microolap.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers