Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
Bruce Momjianwrites: > Oh, I see what you mean. I was just worried that some code might expect > template1 to always have an oid of 1, but we can just call that code > broken. Ever since we invented template0, it's been possible to drop and recreate template1, so I'd say any expectation that it must have OID 1 has been wrong for a long time. This change will just make the situation more common. regards, tom lane
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
On Sun, Jan 21, 2018 at 11:02:29AM -0500, Tom Lane wrote: > Bruce Momjianwrites: > > On Fri, Jan 19, 2018 at 02:54:25PM -0500, Tom Lane wrote: > >> Command was: DROP DATABASE "template1"; > > > Uh, the oid of the template1 database is 1, and I assume we would want > > to preserve that too. > > I don't feel any huge attachment to that. In the first place, under > this proposal recreating template1 is something you would only need to do > if you weren't satisfied with its default properties as set by initdb. > Which ought to be a minority of users. In the second place, if you are > changing those properties from the way initdb set it up, it's not really > virgin template1 anymore, so why shouldn't it have a new OID? Oh, I see what you mean. I was just worried that some code might expect template1 to always have an oid of 1, but we can just call that code broken. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
I've been hacking away at this patch, and attached is what I've got so far. I think this is committable, but if anyone wants to do further review and testing, that'd be fine. Per discussion, I got rid of the separate --set-db-properties switch: additional database properties are now applied if and only if --create is specified. But the DATABASE TOC entry itself still contains only CREATE DATABASE; the additional commands are carried in a "DATABASE PROPERTIES" TOC entry so that they will be issued after reconnecting to the new DB. This dodges the "ALTER DATABASE SET default_transaction_read_only" problem. (Furthermore, it turns out that we have to do it like that because pg_restore issues any one TOC entry's contents as a single PQexec. If you try to cram anything else in with the CREATE DATABASE, then the server spits up because CREATE DATABASE becomes part of an implicit transaction block.) I also fixed it so that a database's comment, security label, and ACL are restored only when saying --create. This is different from the previous behavior for DB comments and security labels, but it seems a great deal more useful and consistent. In no other case would pg_dump/pg_restore attempt to restore a comment, security label, or ACL for an object it hadn't created, so why should it act differently for databases? Worth noting here is that if we accept that behavior, the problem for which "COMMENT ON CURRENT_DATABASE" was proposed goes away, because there's no longer a risk of trying to apply COMMENT ON DATABASE to the wrong database. We might still want that patch as a standalone feature, but pg_dump no longer needs it. Another point worth commenting on (this change was also in the v13 patch) is that pg_dumpall formerly took some pains to omit the encoding and locale specifications in its CREATE DATABASE commands, for databases whose settings matched the cluster default. This new implementation won't do that. We could change it to do so, but then a standalone "pg_dump --create" would also act that way, which is not really easy to defend. IIRC, the argument for pg_dumpall doing it like that was to make it less painful to migrate a cluster to a new machine that might not have the identical set of locales, or if one wanted to migrate everything to a new encoding. But those are not mainstream use-cases, and if you really need to do that you can still get there by dumping databases individually without using --create. On the other hand, there are obvious gotchas involved in letting a dump/reload silently change to another locale or encoding. So I think this is an improvement overall, but it bears noting as a behavioral change. Another point to note is that I dithered about whether to bump the pg_dump archive version number, which would have the effect of preventing pre-v11 versions of pg_restore from processing dumps made by this version. The argument for breaking archive compatibility is that older pg_restore versions will not know that it'd be a good idea to skip DATABASE PROPERTIES TOC entries and database ACL entries if not using --create. However, in default cases there won't be a DATABASE PROPERTIES entry to skip. Moreover, applying these entries unconditionally isn't that much different conceptually from applying database comments or sec labels unconditionally, as such older pg_restore versions would do anyway. It also seems like if your back were against the wall, being able to read a newer archive file with an older pg_restore is better than being locked out of it completely. So I'm leaning to no version bump, but it could still be discussed. One thing we could possibly use here is more regression test cases. The only existing regression test that's affected by this patch is 002_pg_dump.pl's expectations about which cases will print a database comment, so it seems like we're missing some behavioral coverage. Not sure what that should look like though. This patch has to be applied over the patches proposed in https://www.postgresql.org/message-id/21714.1516553459%40sss.pgh.pa.us Aside from touching some of the same code, this is dependent on the changes made there to make comment, seclabel, and ACL entries reliably identifiable. As far as notable code changes go, I got rid of the previous patch's move of executeQuery() into dumputils.c. That had some undesirable knock-on effects in terms of creating even more coupling between different modules (through the g_verbose global), and it was basically misguided anyway. pg_dump executes queries via ExecuteSqlQuery; we do not need a few of its functions to be using low-level routines with behavior different from that. If anyone wants to do further review on this, let me know; otherwise I'll push it in a day or so. regards, tom lane diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml index 08cad68..11582dd 100644 *** a/doc/src/sgml/ref/pg_dump.sgml --- b/doc/src/sgml/ref/pg_dump.sgml
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
Bruce Momjianwrites: > On Fri, Jan 19, 2018 at 02:54:25PM -0500, Tom Lane wrote: >> Command was: DROP DATABASE "template1"; > Uh, the oid of the template1 database is 1, and I assume we would want > to preserve that too. I don't feel any huge attachment to that. In the first place, under this proposal recreating template1 is something you would only need to do if you weren't satisfied with its default properties as set by initdb. Which ought to be a minority of users. In the second place, if you are changing those properties from the way initdb set it up, it's not really virgin template1 anymore, so why shouldn't it have a new OID? regards, tom lane
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
On Fri, Jan 19, 2018 at 02:54:25PM -0500, Tom Lane wrote: > Hmm ... so there's a small problem with this idea of dropping and > recreating template1: > > pg_restore: connecting to database for restore > pg_restore: dropping DATABASE template1 > pg_restore: [archiver (db)] Error while PROCESSING TOC: > pg_restore: [archiver (db)] Error from TOC entry 3024; 1262 1 DATABASE > template1 > postgres > pg_restore: [archiver (db)] could not execute query: ERROR: cannot drop a > templ > ate database > Command was: DROP DATABASE "template1"; Uh, the oid of the template1 database is 1, and I assume we would want to preserve that too. -- Bruce Momjianhttp://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
Robert Haaswrites: > On Fri, Jan 19, 2018 at 2:54 PM, Tom Lane wrote: >> What do people think of just removing this DROP DATABASE restriction? >> Arguably, superusers should know better than to drop template1 anyway. >> Maybe we should replace it with a hard-wired check against dropping >> template0 (matched by name) just to stave off the worst-case scenario. > I think it's a little scary. The tail might be wagging the dog at this point. True, and having to make assumptions about which server version we're restoring to is not very nice either. After further reflection I've found a way to do this on the pg_dump end that's not as ugly as I feared -- no uglier than most of the other hacks there, anyway. So nevermind ... regards, tom lane
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
On Fri, Jan 19, 2018 at 2:54 PM, Tom Lanewrote: > Hmm ... so there's a small problem with this idea of dropping and > recreating template1: > > pg_restore: connecting to database for restore > pg_restore: dropping DATABASE template1 > pg_restore: [archiver (db)] Error while PROCESSING TOC: > pg_restore: [archiver (db)] Error from TOC entry 3024; 1262 1 DATABASE > template1 > postgres > pg_restore: [archiver (db)] could not execute query: ERROR: cannot drop a > templ > ate database > Command was: DROP DATABASE "template1"; > > Now in principle we could hack around that by issuing "ALTER DATABASE > ... IS_TEMPLATE false" first, but it turns out to be harder than you > might think to wedge that into the pg_dump infrastructure. (The > natural way to do it, trying to add this into the dropCmd for the > database TOC entry, fails because (a) DROP DATABASE then ends up as > one part of an implicit transaction block, and (b) it confuses the heck > out of pg_restore's --if-exists kluge.) > > You can actually exhibit this in current releases if you try "pg_dump > --clean --create" on a user-created template database, so it's not > solely the fault of this patch. > > What do people think of just removing this DROP DATABASE restriction? > Arguably, superusers should know better than to drop template1 anyway. > Maybe we should replace it with a hard-wired check against dropping > template0 (matched by name) just to stave off the worst-case scenario. I think it's a little scary. The tail might be wagging the dog at this point. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
Hmm ... so there's a small problem with this idea of dropping and recreating template1: pg_restore: connecting to database for restore pg_restore: dropping DATABASE template1 pg_restore: [archiver (db)] Error while PROCESSING TOC: pg_restore: [archiver (db)] Error from TOC entry 3024; 1262 1 DATABASE template1 postgres pg_restore: [archiver (db)] could not execute query: ERROR: cannot drop a templ ate database Command was: DROP DATABASE "template1"; Now in principle we could hack around that by issuing "ALTER DATABASE ... IS_TEMPLATE false" first, but it turns out to be harder than you might think to wedge that into the pg_dump infrastructure. (The natural way to do it, trying to add this into the dropCmd for the database TOC entry, fails because (a) DROP DATABASE then ends up as one part of an implicit transaction block, and (b) it confuses the heck out of pg_restore's --if-exists kluge.) You can actually exhibit this in current releases if you try "pg_dump --clean --create" on a user-created template database, so it's not solely the fault of this patch. What do people think of just removing this DROP DATABASE restriction? Arguably, superusers should know better than to drop template1 anyway. Maybe we should replace it with a hard-wired check against dropping template0 (matched by name) just to stave off the worst-case scenario. regards, tom lane
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
Robert Haaswrites: > On Fri, Jan 19, 2018 at 10:00 AM, Tom Lane wrote: >> Robert Haas writes: >>> On Fri, Jan 19, 2018 at 9:45 AM, Tom Lane wrote: Well, we could say that the properties of template1 and postgres are only restored if you use --clean. >>> True. Would that be a POLA violation, do you think? >> It seems a bit non-orthogonal. Also, while your point that people >> expect "merge" behavior from pg_dump is certainly true, I'm not >> convinced that anybody would be relying on that for pg_dumpall. > [ assorted examples ] > Still, it's worth thinking over these kinds of > scenarios, I think. People do a lot of ugly things in the real world > that we as developers would never do, mostly to work around the > problems we fail to foresee. Unless someone has a better idea, I'll go with the semantics stated above: DB-level properties of the two standard databases are only transferred to pg_dumpall's target cluster if you authorize dropping their old contents by saying --clean. (pg_upgrade, of course, will do exactly that.) regards, tom lane
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
On Fri, Jan 19, 2018 at 10:00 AM, Tom Lanewrote: > Robert Haas writes: >> On Fri, Jan 19, 2018 at 9:45 AM, Tom Lane wrote: >>> Well, we could say that the properties of template1 and postgres >>> are only restored if you use --clean. > >> True. Would that be a POLA violation, do you think? > > It seems a bit non-orthogonal. Also, while your point that people > expect "merge" behavior from pg_dump is certainly true, I'm not > convinced that anybody would be relying on that for pg_dumpall. What I expect typically happens is that someone restores into an existing cluster and then realizes their mistake and goes and removes all of the extra stuff that got created. This is a bit like when you run tar -zxf expecting it to create a subdirectory, but it turns out to extract in the current directory instead. You then curse your fate and remove all of the files it created. It's annoying, but not that bad. If it nuked the contents of the currently directory first, though, you'd be much sadder. If somebody changed tar to have that behavior, people would probably *eventually* become aware of the risk and adjust their behavior to avoid catastrophe, but it would probably take 1-2 disasters per individual before they got used to the new way, and that's a lot of disasters considering how many people use tar. Or, maybe people wouldn't get used to it and they'd just go after whoever made the change with pitchforks. Anyway, that kind of thing seems like a danger here. The other possibility that comes to mind is that somebody might be doing this is working around a failure of something like CREATE LANGUAGE. Say the name of the .so has changed, so CREATE LANGUAGE will fail or do the wrong thing. Instead of editing the dump, they pre-install the language with the correct definition and then restore over it, ignoring errors. I guess that would only work with pg_dump, not pg_dumpall, unless the database that had the problem was one of the pre-created ones. Still, it's worth thinking over these kinds of scenarios, I think. People do a lot of ugly things in the real world that we as developers would never do, mostly to work around the problems we fail to foresee. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
Robert Haaswrites: > On Fri, Jan 19, 2018 at 9:45 AM, Tom Lane wrote: >> Well, we could say that the properties of template1 and postgres >> are only restored if you use --clean. > True. Would that be a POLA violation, do you think? It seems a bit non-orthogonal. Also, while your point that people expect "merge" behavior from pg_dump is certainly true, I'm not convinced that anybody would be relying on that for pg_dumpall. regards, tom lane
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
Robert Haaswrites: > On Thu, Jan 18, 2018 at 6:35 PM, Tom Lane wrote: >> If we did it like that, the rationale for an actual --set-db-properties >> switch would vanish, at least so far as pg_dumpall is concerned -- we >> could just make all that behavior an integral part of --create. And >> this wouldn't need to be conditional on getting ALTER DATABASE >> CURRENT_DATABASE done. > Unfortunately, I have a feeling that --set-db-properties might not be > the only thing that would vanish. I think users are accustomed by now > to the idea that if you restore into an existing database, the > existing contents are preserved and the new stuff from the dump is > added (possibly with some errors and messiness). With this design, > the existing database contents will instead vanish, and that is > probably going to make somebody unhappy. Well, we could say that the properties of template1 and postgres are only restored if you use --clean. regards, tom lane
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
On Thu, Jan 18, 2018 at 6:35 PM, Tom Lanewrote: > If we did it like that, the rationale for an actual --set-db-properties > switch would vanish, at least so far as pg_dumpall is concerned -- we > could just make all that behavior an integral part of --create. And > this wouldn't need to be conditional on getting ALTER DATABASE > CURRENT_DATABASE done. Unfortunately, I have a feeling that --set-db-properties might not be the only thing that would vanish. I think users are accustomed by now to the idea that if you restore into an existing database, the existing contents are preserved and the new stuff from the dump is added (possibly with some errors and messiness). With this design, the existing database contents will instead vanish, and that is probably going to make somebody unhappy. I agree with you that making ALTER DATABASE accept CURRENT_DATABASE is a good idea. I don't have a great idea what to do about the SET TABLESPACE problem. It's always seemed to me to be sort of weird that you have to have a database in order to create, drop, etc. another database, or even get a list of databases that exist. As a new user, I remember being quite frustrated that connecting to a database that didn't exist gave no hint of how to find out what databases did exist. Eventually I discovered psql -l and all was well, but I had no idea how it worked under the hood. Even though I now understand the architectural considerations that have gotten us to where we are, I still think it would be more intuitive to users if there were a command-shell unrelated to any database that would let you perform operations on databases. I realize, however, that this patch isn't going to create such a thing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
On Fri, Jan 19, 2018 at 10:35 AM, Tom Lanewrote: > I wrote: > > What I think we should do for the time being is to have pg_dump treat > > database tablespace as a property it can't adjust after creation, just > > as it can't adjust locale or encoding. That's a loss of functionality > > for pg_dumpall/pg_upgrade compared to where we are today, in that if > > you've set up the template1 or postgres DBs with nondefault tablespace > > then that won't propagate to the new cluster. But the same can already > > be said about their locale and encoding, and I find it hard to believe > > that many people are trying to give those two DBs tablespace settings > > different from the cluster default, anyway. > > Hm ... actually, there is more than one way to skin this cat. > > Let me offer a modest proposal: pg_dumpall/pg_upgrade should simply DROP > postgres and template1 in the target cluster, and then re-create them > (from template0 of course). With that, we'd not only cope with preserving > their tablespace settings, but we'd gain the ability to preserve their > locale and encoding, even if the target cluster had been initialized with > some other default. > Yes, I agree that this may be simple change to handle this problem. Already pg_upgrade doesn't work if there is any encoding difference between source and target databases. Most probably User will create with same encoding. Regards, Hari Babu Fujitsu Australia
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
I wrote: > What I think we should do for the time being is to have pg_dump treat > database tablespace as a property it can't adjust after creation, just > as it can't adjust locale or encoding. That's a loss of functionality > for pg_dumpall/pg_upgrade compared to where we are today, in that if > you've set up the template1 or postgres DBs with nondefault tablespace > then that won't propagate to the new cluster. But the same can already > be said about their locale and encoding, and I find it hard to believe > that many people are trying to give those two DBs tablespace settings > different from the cluster default, anyway. Hm ... actually, there is more than one way to skin this cat. Let me offer a modest proposal: pg_dumpall/pg_upgrade should simply DROP postgres and template1 in the target cluster, and then re-create them (from template0 of course). With that, we'd not only cope with preserving their tablespace settings, but we'd gain the ability to preserve their locale and encoding, even if the target cluster had been initialized with some other default. If we did it like that, the rationale for an actual --set-db-properties switch would vanish, at least so far as pg_dumpall is concerned -- we could just make all that behavior an integral part of --create. And this wouldn't need to be conditional on getting ALTER DATABASE CURRENT_DATABASE done. Comments? regards, tom lane
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
... okay, so the reason why --set-db-properties isn't as general-purpose a switch as you might think finally penetrated my thick skull :-( The problem is that, as the patch is currently constituted, that will result in emitting a lot of "ALTER DATABASE foo" type commands. If somebody tries to load the pg_dump output into a database not named foo, mayhem ensues. This is exactly the same problem people have already noted with respect to COMMENT ON DATABASE, but now we're propagating it to other operations that have (much) higher downsides for getting it wrong. What we need, therefore, is ALTER DATABASE CURRENT_DATABASE, which I see is part of the pending patch for fixing the COMMENT ON DATABASE problem. That one is stuck in Waiting on Author, but I'm inclined to go see if I can push it across the finish line, and then come back to this one. Even with that, there's a small problem: the backend cannot reasonably support ALTER DATABASE CURRENT_DATABASE SET TABLESPACE, except perhaps in the no-op case where the specified tablespace is already the right one. So this puts a serious hole in my thoughts about allowing the tablespace to be adjusted during --set-db-properties without --create. What I think we should do for the time being is to have pg_dump treat database tablespace as a property it can't adjust after creation, just as it can't adjust locale or encoding. That's a loss of functionality for pg_dumpall/pg_upgrade compared to where we are today, in that if you've set up the template1 or postgres DBs with nondefault tablespace then that won't propagate to the new cluster. But the same can already be said about their locale and encoding, and I find it hard to believe that many people are trying to give those two DBs tablespace settings different from the cluster default, anyway. The only way around that that I can see is to give pg_dump some additional switch along the lines of --i-promise-i-wont-restore- this-into-a-database-with-a-different-name, and then it could emit the same sort of \connect template1 alter database postgres set tablespace ...; \connect postgres dance that pg_dumpall is using today. But that sort of ugliness is exactly what we said we didn't want in this patch. regards, tom lane
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
Robert Haaswrites: > On Wed, Jan 17, 2018 at 6:50 PM, Tom Lane wrote: >> * As the patch stands, --set-db-properties is implicit when you specify >> -C, and in fact the patch goes to the trouble of throwing an error if you >> try to specify both switches. I'm inclined to think this might be a bad >> idea. What about saying that -C enables emitting CREATE DATABASE and >> reconnecting to that DB, and independently of that, --set-db-properties >> enables emitting the additional per-database properties? This seems >> simpler to understand, more flexible, and less of a change from the >> previous behavior of -C. On the other hand you could argue that people >> would always want --set-db-properties with -C and so we're merely >> promoting carpal tunnel (and errors of omission) if we do it like that. > I would vigorously agree with that latter argument. I think the > chances of errors of omission would be very high even if the carpal > tunnel dangers were ameliorated by giving --set-db-properties a short > option name. Fair enough. We'll keep the behavioral change then. >> If so, maybe we could say "-C implies --set-db-properties by default, but >> if you really don't want that, you can say -C --no-set-db-properties". > That seems like a better idea. What I think I'll do for the moment is make them independent options so far as the implementation is concerned, but have the command line switch processing do /* --create implies --set-db-properties, for now anyway */ if (dopt.outputCreateDB) dopt.set_db_properties = 1; If somebody actually asks for --no-set-db-properties, we can add that later. regards, tom lane
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
On Wed, Jan 17, 2018 at 6:50 PM, Tom Lanewrote: > * As the patch stands, --set-db-properties is implicit when you specify > -C, and in fact the patch goes to the trouble of throwing an error if you > try to specify both switches. I'm inclined to think this might be a bad > idea. What about saying that -C enables emitting CREATE DATABASE and > reconnecting to that DB, and independently of that, --set-db-properties > enables emitting the additional per-database properties? This seems > simpler to understand, more flexible, and less of a change from the > previous behavior of -C. On the other hand you could argue that people > would always want --set-db-properties with -C and so we're merely > promoting carpal tunnel (and errors of omission) if we do it like that. I would vigorously agree with that latter argument. I think the chances of errors of omission would be very high even if the carpal tunnel dangers were ameliorated by giving --set-db-properties a short option name. > If so, maybe we could say "-C implies --set-db-properties by default, but > if you really don't want that, you can say -C --no-set-db-properties". That seems like a better idea. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
Haribabu Kommiwrites: > [ pg_dump-and-pg_dumpall-database-handling-refactoring_v12.patch ] I've gone through this in a once-over-lightly fashion. Since there was quite a bit of debate upthread about how things should work, I'd like to start by summarizing the decisions this patch has made, in case anyone still wants to object to them: * pg_dump will now be responsible for dumping all per-database properties, including "ALTER ROLE IN DATABASE SET" settings. To get that behavior, pg_dumpall will invoke pg_dump using -C, or using the new switch --set-db-properties when working on "template1" or "postgres" databases, since those should already exist in the target installation. * pg_dumpall still won't dump template0 (or any other not-datallowconn database). This means that it will no longer propagate any non-default per-database properties for such databases. I think this is fine for template0: if you've messed with that at all, you are fooling with non user-serviceable parts. It's a bit less good for other DBs maybe, but on the whole it seems more consistent to simply treat nonconnectable DBs as not existing. (That could stand to be documented though.) * "pg_dumpall -g" will now produce only role- and tablespace-related output. * There isn't any direct way to ask pg_dump for "just the DB properties and nothing else" (although I suppose you can mostly fake it by adding "--schema nosuchschema" or some such). This makes it inconvenient to exactly duplicate the old behavior of "pg_dumpall -g", if someone wanted to do that to fit into their existing backup procedures. I'm not sure how important that is. Personally I'm content to lose it, but if there's enough pushback maybe we could invent a "--db-properties-only" switch? There are some other points that haven't been debated: * As the patch stands, --set-db-properties is implicit when you specify -C, and in fact the patch goes to the trouble of throwing an error if you try to specify both switches. I'm inclined to think this might be a bad idea. What about saying that -C enables emitting CREATE DATABASE and reconnecting to that DB, and independently of that, --set-db-properties enables emitting the additional per-database properties? This seems simpler to understand, more flexible, and less of a change from the previous behavior of -C. On the other hand you could argue that people would always want --set-db-properties with -C and so we're merely promoting carpal tunnel (and errors of omission) if we do it like that. If so, maybe we could say "-C implies --set-db-properties by default, but if you really don't want that, you can say -C --no-set-db-properties". Perhaps the only application of this is to reproduce pg_dump's historical behavior, but that's probably of some value. * The patch fails to make any provision for deciding at pg_restore time whether to emit DB properties. Considering that you can use -C at restore time, I think it's pretty awful that you can't use --set-db-properties then. Moreover, not only has the patch not added a separate "DATABASE PROPERTIES" TOC entry type as was originally proposed, what it's actually doing is emitting two identically labeled "DATABASE" TOC entries, one with the CREATE and the other with the rest. That's simply horrid. I think we need to do this properly and emit two distinguishable TOC entries, and control which of them get printed on the restore side of the logic, not the TOC entry construction side. (I'm not sure at this point if we need an archive version bump to do that, but perhaps not --- in the past we've added new TOC types without a version bump.) * Some years ago, commit 4bd371f6f caused pg_dumpall to emit "SET default_transaction_read_only = off;" in its scripts, so that it could successfully handle DBs that have "SET default_transaction_read_only = on" as a database property. This was intentionally NOT done to pg_dump, arguing that doing so would greatly weaken such a setting as a defense against stupid errors (i.e. restoring into your production database). The patch currently ignores that reasoning and moves the setting into pg_dump anyway. Haribabu mentioned this point upthread and got basically no response, but I'm concerned about it. I think however that we might be able to dodge that issue as a byproduct of fixing the previous problem. If the CREATE DATABASE is done as one TOC entry, and we reconnect to the target DB after processing that entry, and then we process the DATABASE PROPERTIES entry, then any DB property settings we've installed will not apply in our existing session. So I think we can just leave out the "SET default_transaction_read_only = on" altogether. Obviously this puts a premium on not reconnecting, but reconnecting would break other more important features like --single-transaction, so I am not worried about that. * An issue in connection with that is that because you can't issue ALTER DATABASE SET TABLESPACE against the
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
On Thu, Jan 18, 2018 at 8:25 AM, Tom Lanewrote: > Haribabu Kommi writes: > > [ pg_dump-and-pg_dumpall-database-handling-refactoring_v12.patch ] > > I started to look through this, and almost immediately found that the > diff in t/002_pg_dump.pl doesn't seem necessary --- the test passes > for me without applying that hunk. Is that a leftover from a previous > patch iteration, or is there some platform dependency in the test? > If it's not necessary, I'd be inclined to leave it as it was. > Thanks for the review. Yes, it is a left over from previous patch iteration. The test used to fail with this patch earlier. Now there is no problem in test even after removing the hunk from my side also. There is no platform dependency. Attached is an updated patch after removing the test changes. Regards, Hari Babu Fujitsu Australia pg_dump-and-pg_dumpall-database-handling-refactoring_v13.patch Description: Binary data
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
Haribabu Kommiwrites: > [ pg_dump-and-pg_dumpall-database-handling-refactoring_v12.patch ] I started to look through this, and almost immediately found that the diff in t/002_pg_dump.pl doesn't seem necessary --- the test passes for me without applying that hunk. Is that a leftover from a previous patch iteration, or is there some platform dependency in the test? If it's not necessary, I'd be inclined to leave it as it was. regards, tom lane
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
On Fri, Jan 5, 2018 at 4:32 PM, Haribabu Kommiwrote: > > > Corrected. > Updated patch attached. > > Moved this CF item to "Ready for committer" Regards, Vaishnavi, Fujitsu Australia.
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
On Wed, Nov 29, 2017 at 4:25 PM, Michael Paquierwrote: > On Wed, Nov 8, 2017 at 8:50 AM, Haribabu Kommi > wrote: > > Ok. Removed the documentation changes that it cannot be used for normal > > scenarios, and also added a Note section explaining the problem of using > > the dump with pg_restore command with --clean and --create options. > > Hari, the documentation portion of the patch does not apply. Could you > rebase? For now I am moving it to next CF as this did not get any > reviews, and the status is switched to "waiting on author". > Rebased patch attached that fixes the documentation build problem. Regards, Hari Babu Fujitsu Australia pg_dump-and-pg_dumpall-database-handling-refactoring_v11.patch Description: Binary data
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
On Wed, Nov 8, 2017 at 8:50 AM, Haribabu Kommiwrote: > Ok. Removed the documentation changes that it cannot be used for normal > scenarios, and also added a Note section explaining the problem of using > the dump with pg_restore command with --clean and --create options. Hari, the documentation portion of the patch does not apply. Could you rebase? For now I am moving it to next CF as this did not get any reviews, and the status is switched to "waiting on author". -- Michael