Re: [HACKERS] QSoC proposal: Rewrite pg_dump and pg_restore
On Fri, Mar 21, 2014 at 4:09 AM, Tom Lane t...@sss.pgh.pa.us wrote: Craig Ringer cr...@2ndquadrant.com writes: Here's how I think it needs to look: [ move all the functionality to the backend ] Of course, after you've done all that work, you've got something that is of exactly zero use to its supposed principal use-case, pg_dump. pg_dump will still have to support server versions that predate all these fancy new dump functions, and that pretty much ensures that most of pg_dump's core functionality will still be on the client side. Or, if you try to finesse that problem by making sure the new server APIs correspond to easily-identified pieces of pg_dump code, you'll probably end up with APIs that nobody else wants to use :-(. Or you should mandate that new server versions should be able to consume _old_ pg_dump version output. This would change the recommended when upgrading, dump using the new pg_dump to when upgrading, dump using the old pg_dump. This would be necessary policy going forward anyway, if most of the pg_dump functionality was server-side, because it would be generating dumps in the server-version dump format, not the client-version format. 'Regards Marcin Mańk (goes back to lurker cave...)
Re: [HACKERS] QSoC proposal: Rewrite pg_dump and pg_restore
On Thu, Mar 20, 2014 at 11:09 PM, Tom Lane t...@sss.pgh.pa.us wrote: Craig Ringer cr...@2ndquadrant.com writes: Here's how I think it needs to look: [ move all the functionality to the backend ] Of course, after you've done all that work, you've got something that is of exactly zero use to its supposed principal use-case, pg_dump. pg_dump will still have to support server versions that predate all these fancy new dump functions, and that pretty much ensures that most of pg_dump's core functionality will still be on the client side. Or, if you try to finesse that problem by making sure the new server APIs correspond to easily-identified pieces of pg_dump code, you'll probably end up with APIs that nobody else wants to use :-(. It's worse than that. If you put all the logic in the server, then a dump taken on an older version won't be able to quote keywords added in the newer version. Go directly to fail. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] QSoC proposal: Rewrite pg_dump and pg_restore
On 03/21/2014 09:38 AM, Robert Haas wrote: On Thu, Mar 20, 2014 at 11:09 PM, Tom Lane t...@sss.pgh.pa.us wrote: Craig Ringer cr...@2ndquadrant.com writes: Here's how I think it needs to look: [ move all the functionality to the backend ] Of course, after you've done all that work, you've got something that is of exactly zero use to its supposed principal use-case, pg_dump. pg_dump will still have to support server versions that predate all these fancy new dump functions, and that pretty much ensures that most of pg_dump's core functionality will still be on the client side. Or, if you try to finesse that problem by making sure the new server APIs correspond to easily-identified pieces of pg_dump code, you'll probably end up with APIs that nobody else wants to use :-(. It's worse than that. If you put all the logic in the server, then a dump taken on an older version won't be able to quote keywords added in the newer version. Go directly to fail. Yeah. This tantalizing project has been looked at several times and found to be a viper's nest. What would be useful for many purposes, and is a long-standing project of mine that I still haven't found time to make progress on, is that the server should contain functions to produce the creation SQL for all its own objects, free of the locks that pg_dump requires for consistency. That would be a great SoC project, incidentally. I'd even volunteer to mentor that one. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] QSoC proposal: Rewrite pg_dump and pg_restore
Andrew Dunstan escribió: What would be useful for many purposes, and is a long-standing project of mine that I still haven't found time to make progress on, is that the server should contain functions to produce the creation SQL for all its own objects, free of the locks that pg_dump requires for consistency. Maybe you'd like my DDL deparse project, then. Right now it's only of use for event triggers during DDL execution, but I don't see any strong reason it cannot be used to reconstruct object creation commands from only their identifying OID. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] QSoC proposal: Rewrite pg_dump and pg_restore
On Tue, Mar 18, 2014 at 8:41 PM, Alexandr askel...@gmail.com wrote: Rewrite (add) pg_dump and pg_restore utilities as libraries (.so, .dll .dylib) This strikes me as (1) pretty vague and (2) probably too hard for a summer project. I mean, getting the existing binaries to build libraries that you can call with some trivial interface that mimics the existing command-line functionality of pg_dump might be doable, but that's not all that interesting. What people are really going to want is a library with a sophisticated API that lets you do interesting things programmatically. But that's going to be hard. AFAIK, nobody's even tried to figure out what that API should look like. Even if we had that worked out, a non-trivial task, the pg_dump source code is a mess, so refactoring it to provide such an API is likely to be a job and a half. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] QSoC proposal: Rewrite pg_dump and pg_restore
On 03/21/2014 09:28 AM, Robert Haas wrote: On Tue, Mar 18, 2014 at 8:41 PM, Alexandr askel...@gmail.com wrote: Rewrite (add) pg_dump and pg_restore utilities as libraries (.so, .dll .dylib) This strikes me as (1) pretty vague and (2) probably too hard for a summer project. I mean, getting the existing binaries to build libraries that you can call with some trivial interface that mimics the existing command-line functionality of pg_dump might be doable, but that's not all that interesting. What people are really going to want is a library with a sophisticated API that lets you do interesting things programmatically. But that's going to be hard. AFAIK, nobody's even tried to figure out what that API should look like. Even if we had that worked out, a non-trivial task, the pg_dump source code is a mess, so refactoring it to provide such an API is likely to be a job and a half. ... and still wouldn't solve one of the most frequently requested things for pg_dump / pg_restore, which is the ability to use them *server-side* over a regular PostgreSQL connection. It'd be useful progress toward that, though. Right now, we can't even get the PostgreSQL server to emit DDL for a table, let alone do anything more sophisticated. Here's how I think it needs to look: - Design a useful API for pg_dump and pg_restore that is practical to use for pg_dump and pg_restore's current tasks (fast database dump/restore) and also useful for extracting specific objects from the database. When designing, consider that we'll want to expose this API or functions that use it over SQL later. - Create a new libpqdump library. - Implement the designed API in the new library, moving and adjusting code from pg_dump / pg_restore where possible, writing new code where not. - Refactor (closer to rewrite) pg_dump and pg_restore to use libpqdump, removing as much knowledge of the system catalogs etc as possible from them. - Make sure the result still performs OK THEN, once that's settled in: - Modify libpqdump to support compilation as a backend extension, with use of the SPI for queries and use of syscaches or direct scans where possible. - Write a pg_dump extension that uses libpqdump in SPI mode to expose its API over SQL, or at least uses it to provide SQL functions to describe database objects. So you can dump a DB, or a subset of it, over SQL. After all, a libpgdump won't do much good for the large proportion of PostgreSQL users who use Java/JDBC, who can't use a native library (without hideous hacks with JNI). For the very large group who use libpq via language-specific client interfaces like the Pg gem for Ruby, psycopg2 for Python, DBD::Pg for Perl, etc, it'll require a lot of work to wrap the API and maintain it. Wheras a server-side SQL-callable interface would be useful and immediately usable for all of them. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] QSoC proposal: Rewrite pg_dump and pg_restore
Craig Ringer cr...@2ndquadrant.com writes: Here's how I think it needs to look: [ move all the functionality to the backend ] Of course, after you've done all that work, you've got something that is of exactly zero use to its supposed principal use-case, pg_dump. pg_dump will still have to support server versions that predate all these fancy new dump functions, and that pretty much ensures that most of pg_dump's core functionality will still be on the client side. Or, if you try to finesse that problem by making sure the new server APIs correspond to easily-identified pieces of pg_dump code, you'll probably end up with APIs that nobody else wants to use :-(. In any case, I quite agree with the sentiment that this is not a suitable problem for a GSOC project. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] QSoC proposal: Rewrite pg_dump and pg_restore
On 03/21/2014 11:09 AM, Tom Lane wrote: Craig Ringer cr...@2ndquadrant.com writes: Here's how I think it needs to look: [ move all the functionality to the backend ] Of course, after you've done all that work, you've got something that is of exactly zero use to its supposed principal use-case, pg_dump. pg_dump will still have to support server versions that predate all these fancy new dump functions, and that pretty much ensures that most of pg_dump's core functionality will still be on the client side. Or, if you try to finesse that problem by making sure the new server APIs correspond to easily-identified pieces of pg_dump code, you'll probably end up with APIs that nobody else wants to use :-(. Yeah, that's why it's necessary to create a libpqdump that's usable client-side even if you want server-side dump support. So it's allow the functionality to be used from the backend as well, not just move all the functionality to the backend. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] QSoC proposal: Rewrite pg_dump and pg_restore
Hello! Here is the text of my proposal which I've applied to GSoC. (and link https://docs.google.com/document/d/1s-Q4rzEysPxo-dINsk_eKFJOBoVjNYDrQ-Oh75gtYEM/edit?usp=sharing) Any suggestions and comments are welcome. * PostgreSQL GSoC 2014 proposal Project name Rewrite (add) pg_dump and pg_restore utilities as libraries (.so, .dll .dylib) Short description pg_dump is a utility for backing up a PostgreSQL database. It makes consistent backups even if the database is being used concurrently. pg_restore is a utility for restoring a PostgreSQL database from an archive created by pg_dump in one of the non-plain-text formats. I think it will be more comforatable to use these tools as libraries. Name: Alexander Shvidchenko E-mail: askel...@gmail.com mailto:askel...@gmail.com Location: Rostov-on-Don, Russia (UTC +04.00) Benefits to the PostgreSQL Community This feature will expand opportunities to work with backups of databases. Especially, it’s important for third-party developers. It will be easier to use the functionality of the tools in applications when they will be libraries. Quantifiable results Static and dynamic libraries with the functionality of pg_dump and pg_restore tools (.so, .dll and .dylib files) Project Schedule until May 31 Make code review and solve architecture questions with help of community 1 June – 30 June Detailed implementation of libraries. 1 July – 31 July Finish Implementation of libraires and begin testing. 1 August -15 August Final refactoring, testing and commit. Academic experience I entered the university in 2013. Before entering the university I finished the college in 2012. My graduate work in the college was the client-server application. It was a realization of XMPP. The client was realized in Qt. The client worked with SQLite database and the server worked with MySQL database. Why is PostgreSQL? - I’m intereted in this idea and believe this project would be useful for the community; - PostgreSQL is a very respected community. I would be proud to be a part of it; - PostgreSQL is one of the best DBMS and I would like to make it better. Links 1) PostgreSQL 9.3.3 Documentation, pg_dump http://www.postgresql.org/docs/9.3/static/app-pgdump.html 2) PostgreSQL 9.3.3 Documentation, pg_restore http://www.postgresql.org/docs/9.3/static/app-pgrestore.html 3) Static library http://en.wikipedia.org/wiki/Static_library 4) Dynamic library http://en.wikipedia.org/wiki/Dynamic-link_library * With best wishes, Alexander S.