Re: [HACKERS] pg_dump and pg_dumpall in real life (proposal)
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 11/12/2013 03:28 PM, Stephen Frost wrote: > * Rafael Martinez (r.m.guerr...@usit.uio.no) wrote: >> Comments? > > Create a wiki page for it. :) > What about this to start with?: https://wiki.postgresql.org/wiki/Pg_dump_improvements - -- Rafael Martinez Guerrero Center for Information Technology University of Oslo, Norway PGP Public Key: http://folk.uio.no/rafael/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iEYEARECAAYFAlKCTfAACgkQBhuKQurGihSrYQCeKyYVthpbk47hGjayBjidqaFL nysAn3JJjGT/8SuDUi2Nt6hEZ4eu1smz =7wjV -END PGP SIGNATURE- -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump and pg_dumpall in real life (proposal)
* Rafael Martinez (r.m.guerr...@usit.uio.no) wrote: > Comments? Create a wiki page for it. :) Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_dump and pg_dumpall in real life (proposal)
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 11/11/2013 11:20 PM, Josh Berkus wrote: > On 11/11/2013 06:24 AM, Stephen Frost wrote: >> * Rafael Martinez (r.m.guerr...@usit.uio.no) wrote: >>> * We need a pg_dump solution that can generate in one step all >>> the necessary pieces of information needed when restoring or >>> cloning a database. (schema, data, privileges, users and alter >>> database/role data) >> [.] > >> As for 'what we need', I'd think someone with the time and energy >> to write the patch and work with the community to implement it.. > > +1 > Well, I am willing to take a chance on the first suggestion if nobody else has the time or energy. I have never sent a patch or have worked with the postgres code, but I think it can be done without a lot of work with some reuse of the code used in pg_dumpall. This is a proposal based on the feedback we have received: * pg_dump will also deliver information about "ALTER DATABASE ... SET" data for a given database when the option '--create' is used. * pg_dump will deliver information about ROLES used and "ALTER ROLE ... SET" data for a given database when a new option i.e. "--roles-global" is used. * pg_restore will restore ROLE information when used with a new option i.e. "--roles-global" and "ALTER DATABASE ... SET" information when used with the '--create' option. * We need to do something with how pg_restore will handle ROLES information because some security concerns when restoring roles that already exists on the target server. Some of the suggestions are: a) Implement and use CREATE ROLE IF NOT EXISTS and just throw a warning or handle the "role already exists" error message gracefully. b) Use a new option i.e. "--reuse-roles-in-conflict" to behave like suggestion a). If this option is not used, pg_restore will stop with a fatal error when a role already exist. c) Use a new option i.e. "--on-role-error-stop" to stop with a fatal error when a role already exist. If this option is not used pg_restore will behave like suggestion a). d) Use a new option i.e. "--rename-roles-in-conflict" to rename the roles that already exists. If this option is not used, pg_restore will stop with a fatal error when a role already exist. I think I prefer b) to continue with the postgres spirit of security by default. d) is too complicated for me due to lack of knowledge of the postgres code. Comments? regards, - -- Rafael Martinez Guerrero Center for Information Technology University of Oslo, Norway PGP Public Key: http://folk.uio.no/rafael/ -BEGIN PGP SIGNATURE- Version: GnuPG v2.0.14 (GNU/Linux) iEYEARECAAYFAlKB8w0ACgkQBhuKQurGihSq7QCfZdzreRGNRx0vUzXPjYqzNOIP LqgAoJnfeCYjsfEUmsYvvp3DSL959IRL =8Ynv -END PGP SIGNATURE- -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump and pg_dumpall in real life
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 11/11/2013 09:59 PM, Rafael Martinez wrote: > * We need a pg_dump solution that can generate in one step all the > necessary pieces of information needed when restoring or cloning a > database. (schema, data, privileges, users and alter database/role > data) ... and if some users/roles already exist, but have different meanings? Or some roles exist and some don't? I'm very strongly against adding and using CREATE ROLE IF NOT EXISTS. pg_restore should handle this case-by-case, forcing the user to specify explicitly role-by-role that they want a given role in the existing DB re-used if it exists, or want a new one created with a new name in case of a clash. A --rename-all-conflicting-roles and --reuse-all-conflicting-roles option could be added for the all-or-none options. IMO if neither is specified, the existence of any role name conflict should be a restore error. > * It would be great to be able to tell pg_restore that user1 in > the dump will became user2 in the restored/cloned database. The > same for the name of the database. Agreed that this would be useful. Needs to deal with the case where the users should be separated but they should remain a member of some common role, though - eg "olduser" becomes "newuser" but the dumped "olduser" was member of role "users" and "newuser" should also be member of "users", not some renamed role. - -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.15 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBAgAGBQJSgcfKAAoJELBXNkqjr+S2XWIH/2c1Mcd4ldVTAPw/jAln4gNM YH8SRPlsGU0fqfbYoKg/1y0K/Wdjdlac9bjjGzYpODryaXGopf1i+pWaphF2kJTM LeMRVgEFEW7u2Dr6FXajQTQCiXLnA8C16NmmgIdqZZgYCsOwCorG+gFNfI8fZyft okCQpYcljGXzlc218DI6/o4OZBBSdLh8diTzF8+xywoXJZopdAwfHDPPpAvizPye rcUUkq1svArq78HakSuI8HoCy3ZHuiCf8mQEUPcLhFrwgh+bkrs29W7YAdD75gr4 yp32XeyOY5npXHaG9mHghs7anbUnwywJVEzpwKAf0SyPe7zunw8fdtx2NSF70no= =dusS -END PGP SIGNATURE- -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump and pg_dumpall in real life
On Mon, Nov 11, 2013 at 8:20 PM, Josh Berkus wrote: > > > [...] > > Well, then we just need pg_restore to handle the "role already exists" > error message gracefully. That's all. Or a "CREATE ROLE IF NOT EXISTS" > statement, and use that for roles. > I'm working in a patch to add IF NOT EXISTS for all CREATE statements, including of course the CREATE ROLE statement. Regards, -- FabrÃzio de Royes Mello Consultoria/Coaching PostgreSQL >> Timbira: http://www.timbira.com.br >> Blog sobre TI: http://fabriziomello.blogspot.com >> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello >> Twitter: http://twitter.com/fabriziomello
Re: [HACKERS] pg_dump and pg_dumpall in real life
On 11/11/2013 05:50 PM, David Johnston wrote: Andrew Dunstan wrote A general ability to rename things would be good. In particular, restoring schema x into schema y or table x into table y would be very useful, especially if you need to be able to compare old with new. compare old and new what? Data is what I had in mind. There have been plenty of times when I've been asked ex post to find out what's changed in some table in the last 24 hours or something like that, and all I've had to work with is yesterday's dump file. The handsprings you have to turn in order to get the old version of the table and the new version side by side make it painful - it would be nice to be able to say "restore this table but with that name," or "restore this table but into that schema". cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump and pg_dumpall in real life
On 11/11/2013 03:06 PM, David Johnston wrote: > Josh Berkus wrote >> Well, then we just need pg_restore to handle the "role already exists" >> error message gracefully. That's all. Or a "CREATE ROLE IF NOT EXISTS" >> statement, and use that for roles. > > My only qualm here is if the exists check is based off of role name only. > If database "A" and database "B" came from different clusters but both have > a role "david" the actual identity of "david" is (could be) different > because the source cluster. > > The risk of such occurring is a high-security situation is likely to be > small but some kind of "--ignore-different-cluster-same-role" flag may be > worthwhile such that pg_restore will error unless that flag is set (i.e., > high security by default). The error itself should be rare enough most > people wouldn't even notice it is there but seeing such an error (with a > hint provided as well) would be easily able to disable and continue on with > the restore. I'd do the opposite: let's optimize for the most common case, not the least common one. So we'd do --role-errors, which would throw a fatal error on duplicate roles, instead of just posting a WARNING. Again, this is all rather academic, unless you know someone who's eager to dig into pg_dump/pg_restore. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump and pg_dumpall in real life
Josh Berkus wrote > Well, then we just need pg_restore to handle the "role already exists" > error message gracefully. That's all. Or a "CREATE ROLE IF NOT EXISTS" > statement, and use that for roles. My only qualm here is if the exists check is based off of role name only. If database "A" and database "B" came from different clusters but both have a role "david" the actual identity of "david" is (could be) different because the source cluster. The risk of such occurring is a high-security situation is likely to be small but some kind of "--ignore-different-cluster-same-role" flag may be worthwhile such that pg_restore will error unless that flag is set (i.e., high security by default). The error itself should be rare enough most people wouldn't even notice it is there but seeing such an error (with a hint provided as well) would be easily able to disable and continue on with the restore. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/pg-dump-and-pg-dumpall-in-real-life-tp518p5777823.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump and pg_dumpall in real life
Andrew Dunstan wrote > A general ability to rename things would be good. In particular, > restoring schema x into schema y or table x into table y would be very > useful, especially if you need to be able to compare old with new. compare old and new what? I would imagine that schema comparisons would be much easier if the only thing that is different is the database name and you compare database "old" to database "new". Are there any existing threads or posts, that you recollect, that detail solid use-cases for "clone-and-rename" mechanics? I don't seem to recall anything in the past year or so but my coverage is probably only about 70% in that timeframe. SQL seems particularly unfriendly to renaming and runtime name resolution in general (largely due to caching effects). Some kind of alias mechanism makes sense conceptually but the performance hit for such isn't likely to be worth incurring. I could see having table name aliases so that raw data in a dump from one database could be restored into another but I'd likely require that the user be able to generate the target schema from source themselves. That would facilitate the use-case where the DBA/programmer is able to fully recreate their schema from source and only require that actual data be restored into the newly created database. I can see where grants may fall into a grey middle-area but functions/view/triggers and the like would need to be synchronized with any schema naming changes and that should, IMO, be driven from source and not facilitated by a dump/restore process. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/pg-dump-and-pg-dumpall-in-real-life-tp518p5777816.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump and pg_dumpall in real life
On 11/11/2013 06:24 AM, Stephen Frost wrote: > * Rafael Martinez (r.m.guerr...@usit.uio.no) wrote: >> * We need a pg_dump solution that can generate in one step all the >> necessary pieces of information needed when restoring or cloning a >> database. (schema, data, privileges, users and alter database/role data) > > This sounds pretty reasonable and should be possible to implement- but > realize that, on the restore side, you might end up with multiple > attempts to create the same objects. Consider a role that's depended > upon by objects in two databases- it would be included in the dump of > both of those databases and if you restored both of those into the same > cluster, one of the CREATE ROLE statements would fail. Well, then we just need pg_restore to handle the "role already exists" error message gracefully. That's all. Or a "CREATE ROLE IF NOT EXISTS" statement, and use that for roles. >> * It would be great to be able to tell pg_restore that user1 in the >> dump will became user2 in the restored/cloned database. The same for >> the name of the database. > > This is a lot uglier, unfortunately. We've seen this multiple times > before- there's not a good way to provide such a mapping as a command > line option. There may also be issues with the dependency resolution.. This sounds like one of the reasons we still *have* text-mode dumps. For stuff like this. > As for 'what we need', I'd think someone with the time and energy to > write the patch and work with the community to implement it.. +1 -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump and pg_dumpall in real life
On 11/11/2013 08:59 AM, Rafael Martinez wrote: * It would be great to be able to tell pg_restore that user1 in the dump will became user2 in the restored/cloned database. The same for the name of the database. A general ability to rename things would be good. In particular, restoring schema x into schema y or table x into table y would be very useful, especially if you need to be able to compare old with new. Unfortunately, this would involve a fairly significant change in the design of pg_dump / pg_restore. The stored SQL is currently fairly opaque, and a renaming scheme would probably need to implement instead some sort of placeholder mechanism. That would mean a LOT of work. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump and pg_dumpall in real life
* Rafael Martinez (r.m.guerr...@usit.uio.no) wrote: > * We need a pg_dump solution that can generate in one step all the > necessary pieces of information needed when restoring or cloning a > database. (schema, data, privileges, users and alter database/role data) This sounds pretty reasonable and should be possible to implement- but realize that, on the restore side, you might end up with multiple attempts to create the same objects. Consider a role that's depended upon by objects in two databases- it would be included in the dump of both of those databases and if you restored both of those into the same cluster, one of the CREATE ROLE statements would fail. I'd think this would also be a new pg_dump option along the lines of 'include global dependencies' or similar. Reading the older threads, I also agree that a '--create' version of pg_dump should include the various SET commands for the database to be configured the same as the one being dump'd. The next part seems simple- let's get someone to do it.. :) > * It would be great to be able to tell pg_restore that user1 in the > dump will became user2 in the restored/cloned database. The same for > the name of the database. This is a lot uglier, unfortunately. We've seen this multiple times before- there's not a good way to provide such a mapping as a command line option. There may also be issues with the dependency resolution.. > * For serious backup management of large and complicated databases, > pg_dump with the custom output + pg_restore is the only feasible solution. Sure; is there a question here? I don't think that means we're going to change the default, though there is a whole other thread on that subject. > What do you think about the subject? Does it sound like a reasonable > proposition? What do we need to implement some of these changes? As for 'what we need', I'd think someone with the time and energy to write the patch and work with the community to implement it.. Thanks, Stephen signature.asc Description: Digital signature