Re: [HACKERS] Detect supported SET parameters when pg_restore is run
On 9/27/16 6:57 PM, Vitaly Burovoy wrote: > On 9/27/16, Vitaly Burovoy wrote: >> On 9/27/16, Tom Lane wrote: >>> (The other thing I'd want here is a --target-version option so that >>> you could get the same output alterations in pg_dump or pg_restore to >>> text. Otherwise it's nigh undebuggable, and certainly much harder >>> to test than it needs to be.) >> >> I thought that way. I'm ready to introduce that parameter, but again, >> I see now it will influence only SET parameters. Does it worth it? > > The only reason I have not implemented it was attempt to avoid users > being confused who could think that result of pg_dump (we need it > there for the plain text output) or pg_restore can be converted for > target version to be restored without new features (but now it is > wrong). I'm in favor of making this work. But I also agree with the earlier commenters that we shouldn't overload remoteVersion to mean sometimes source and sometimes target version. It would probably be enough for now to leave remoteVersion to mean source version, introduce a new field targetVersion, default that to remoteVersion in plain text format, and set it to the actual remote version when restoring directly to a database. It will need a bit of work to tie it all together, but it shouldn't be too difficult. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Detect supported SET parameters when pg_restore is run
2016-09-27 23:12 GMT+02:00 Tom Lane : > Vitaly Burovoy writes: > > On 9/27/16, Tom Lane wrote: > >> I'm not exactly convinced that you did. There's only one copy of > >> Archive->remoteVersion, and you're overwriting it long before the > >> dump process is over. > > > It does not seem that I'm "overwriting it long before the dump process > > is over"... > > There's a lot that happens during RestoreArchive. Even if none of it > inspects remoteVersion today, I do not think that's a safe assumption to > make going forward. The easiest counterexample is that this very bit of > code you want to add does so. I really do not want to get into a design > that says "remoteVersion means the source server version until we reach > RestoreArchive, and the target version afterwards". That way madness > lies. If we're going to try altering the emitted SQL based on target > version, let's first create a separation between those concepts; otherwise > I will bet that we add more bugs than we remove. > > (The other thing I'd want here is a --target-version option so that > you could get the same output alterations in pg_dump or pg_restore to > text. Otherwise it's nigh undebuggable, and certainly much harder > to test than it needs to be.) > This options likes like very good idea. Regards Pavel > > regards, tom lane > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers >
Re: [HACKERS] Detect supported SET parameters when pg_restore is run
On 9/27/16, Vitaly Burovoy wrote: > On 9/27/16, Tom Lane wrote: >> (The other thing I'd want here is a --target-version option so that >> you could get the same output alterations in pg_dump or pg_restore to >> text. Otherwise it's nigh undebuggable, and certainly much harder >> to test than it needs to be.) > > I thought that way. I'm ready to introduce that parameter, but again, > I see now it will influence only SET parameters. Does it worth it? The only reason I have not implemented it was attempt to avoid users being confused who could think that result of pg_dump (we need it there for the plain text output) or pg_restore can be converted for target version to be restored without new features (but now it is wrong). -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Detect supported SET parameters when pg_restore is run
On 9/27/16, Tom Lane wrote: > Vitaly Burovoy writes: >> On 9/27/16, Tom Lane wrote: >>> I'm not exactly convinced that you did. There's only one copy of >>> Archive->remoteVersion, and you're overwriting it long before the >>> dump process is over. > >> It does not seem that I'm "overwriting it long before the dump process >> is over"... > > There's a lot that happens during RestoreArchive. Even if none of it > inspects remoteVersion today, I do not think that's a safe assumption to > make going forward. And... Khm... Note that even _now_ AHX->remoteVersion is set to a database version pg_restore connects to... So all the code has it during restoring process... > The easiest counterexample is that this very bit of > code you want to add does so. The only change I've done is set remoteVersion to the maximum allowed when output is the plain text format. > I really do not want to get into a design > that says "remoteVersion means the source server version until we reach > RestoreArchive, and the target version afterwards". That way madness lies. It is only if you think about "remoteVersion" as "sourceServerVersion", but even now it is not so. Moreover RestoreArchive is a delimter only for pg_dump and only when output is a plain text. For other modes of the pg_dump RestoreArchive is not called at all. > If we're going to try altering the emitted SQL based on target version, > let's first create a separation between those concepts; I've just found there is _archiveHandle.archiveRemoteVersion. Is it a parameter you were searched for? The pg_restore code does not use it (just as remoteVersion), but it can do so if it is necessary. > otherwise I will bet that we add more bugs than we remove. > > (The other thing I'd want here is a --target-version option so that > you could get the same output alterations in pg_dump or pg_restore to > text. Otherwise it's nigh undebuggable, and certainly much harder > to test than it needs to be.) I thought that way. I'm ready to introduce that parameter, but again, I see now it will influence only SET parameters. Does it worth it? -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Detect supported SET parameters when pg_restore is run
Vitaly Burovoy writes: > On 9/27/16, Tom Lane wrote: >> I'm not exactly convinced that you did. There's only one copy of >> Archive->remoteVersion, and you're overwriting it long before the >> dump process is over. > It does not seem that I'm "overwriting it long before the dump process > is over"... There's a lot that happens during RestoreArchive. Even if none of it inspects remoteVersion today, I do not think that's a safe assumption to make going forward. The easiest counterexample is that this very bit of code you want to add does so. I really do not want to get into a design that says "remoteVersion means the source server version until we reach RestoreArchive, and the target version afterwards". That way madness lies. If we're going to try altering the emitted SQL based on target version, let's first create a separation between those concepts; otherwise I will bet that we add more bugs than we remove. (The other thing I'd want here is a --target-version option so that you could get the same output alterations in pg_dump or pg_restore to text. Otherwise it's nigh undebuggable, and certainly much harder to test than it needs to be.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Detect supported SET parameters when pg_restore is run
On 9/27/16, Tom Lane wrote: > Vitaly Burovoy writes: >> On 9/27/16, Tom Lane wrote: >>> The general policy has always been that pg_dump output is only expected >>> to >>> restore without errors into a server that's the same or newer version as >>> pg_dump (regardless of the server version being dumped from). > >> Why can't I use it if pg_dump92/pg_restore92 have the same behavior as >> pg_dump96/pg_restore96 except the SET block? > > That's a pretty large "if", and not one I'm prepared to commit to. > Before you assert that it's true, you might want to reflect on the > size of the diff between 9.2 and 9.6 pg_dump (hint: it's over 20K > lines in -u format). > >>> Either that, or the patch is overwriting pg_dump's idea of what the >>> source server version is at the start of the output phase, which is >>> likely to break all kinds of stuff when dumping from an older server.) > >> I agree, that's why I left current behavior as is for the plain text >> output. > > I'm not exactly convinced that you did. There's only one copy of > Archive->remoteVersion, and you're overwriting it long before the > dump process is over. I'm sorry, I'm not so good at knowing the code base but I see that my patch changes Archive->remoteVersion in the "RestoreArchive" which is called after all schema is fetched to pg_dump's structs and just before output is written, moreover, there is a comment that it is a final stage (9.2 has the same block of code): ... /* * And finally we can do the actual output. *... */ if (plainText) RestoreArchive(fout); CloseArchive(fout); exit_nicely(0); } It does not seem that I'm "overwriting it long before the dump process is over"... Also pg_dump -v proves that changing remoteVersion happens after all "pg_dump: finding *", "pg_dump: reading *" and "pg_dump: saving *". > That'd break anything that consulted it to > find the source server's version after RestoreArchive starts. I'm sorry again, could you or anyone else point me what part of the code use Archive->remoteVersion after schema is read? I set up break point at the line AHX->remoteVersion = 99; and ran pg_dump with plain text output to a file (via "-f" option). When the line was reached I set up break points at all lines I could find by a script: egrep -n -r --include '*.c' 'remoteVersion [><]' src/bin/pg_dump/ | awk -F: '{print "b "$1":"$2}' (there were 217 of them) and continued debuging. All three fired break points were expectedly in _doSetFixedOutputState (at the lines where my patch introduced them) after which the program exited normally without stopping. Also I wonder that the process of "restoring" consults Archive->remoteVersion because currently in the pg_restore to plain text output remoteVersion is zero. It means restoring process would output schema to the oldest supported version, but is not so. > I could get behind a patch that split remoteVersion into sourceVersion > and targetVersion and made sure that all the existing references to > remoteVersion were updated to the correct one of those. After that > we could do something like what you have here in a less shaky fashion. > As Robert noted, there'd probably still be a long way to go before it'd > really work to use a newer pg_dump with an older target version, but at > least we'd not be building on quicksand. It means support of restoring from a newer version to an older one. What's for? If new features are used it is impossible to restore (port) them to an older version, if they are not, restoring process is done (i guess) in 99% of cases. -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Detect supported SET parameters when pg_restore is run
Vitaly Burovoy writes: > On 9/27/16, Tom Lane wrote: >> The general policy has always been that pg_dump output is only expected to >> restore without errors into a server that's the same or newer version as >> pg_dump (regardless of the server version being dumped from). > Why can't I use it if pg_dump92/pg_restore92 have the same behavior as > pg_dump96/pg_restore96 except the SET block? That's a pretty large "if", and not one I'm prepared to commit to. Before you assert that it's true, you might want to reflect on the size of the diff between 9.2 and 9.6 pg_dump (hint: it's over 20K lines in -u format). >> Either that, or the patch is overwriting pg_dump's idea of what the >> source server version is at the start of the output phase, which is >> likely to break all kinds of stuff when dumping from an older server.) > I agree, that's why I left current behavior as is for the plain text output. I'm not exactly convinced that you did. There's only one copy of Archive->remoteVersion, and you're overwriting it long before the dump process is over. That'd break anything that consulted it to find the source server's version after RestoreArchive starts. It might not be a bad thing to clearly distinguish source server version from (expected?) target server version, but pg_dump doesn't do that currently, and making it do so is probably not entirely trivial. I think your patch confuses that distinction further, which is not going to be helpful in the long run. I could get behind a patch that split remoteVersion into sourceVersion and targetVersion and made sure that all the existing references to remoteVersion were updated to the correct one of those. After that we could do something like what you have here in a less shaky fashion. As Robert noted, there'd probably still be a long way to go before it'd really work to use a newer pg_dump with an older target version, but at least we'd not be building on quicksand. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Detect supported SET parameters when pg_restore is run
On 9/27/16, Tom Lane wrote: > Robert Haas writes: >> On Mon, Sep 26, 2016 at 9:56 PM, Vitaly Burovoy >> wrote: >>> We do dump/restore schemas/data via custom/dir formats and we have to >>> keep several client versions for 9.2, 9.4 and 9.5 versions on local >>> workstations because after pg_restore95 connects to 9.2, it fails when >>> it sets run-time parameters via SET: > >> I think our policy is that a newer pg_dump needs to work with an older >> version of the database, but I don't think we make any similar >> guarantee for other tools, such as pg_restore. It's an interesting >> question whether we should try a little harder to do that, but I >> suspect it might require more changes than what you've got in this >> patch Well... I'm not inclined to insert support of restoring from a higher major version to a lower one, because it can lead to security issues (e.g. with RLS). But my observation is that all new features supported by the "pg_dump" are "incremental", e.g. the last feature "parallel" for pg_dump/pg_restore --- lack of "PARALLEL UNSAFE" (which is by default) from 9.6 and lack of it from pre-9.6. That behavior allows newer versions of pg_restore to use dumps from DB of older versions because of lack of new features grammar. With the patch I'm able to use pg_dump96/pg_restore96 for our database of 9.2 (dump from 9.2 and restore to 9.2). > The general policy has always been that pg_dump output is only expected to > restore without errors into a server that's the same or newer version as > pg_dump (regardless of the server version being dumped from). If you try > to restore into an older server version, you may get some errors, which we > try to minimize the scope of when possible. But it will never be possible > to promise none at all. I think the correct advice here is simply "don't > use pg_restore -e -1 when trying to restore into an older server version". Why can't I use it if pg_dump92/pg_restore92 have the same behavior as pg_dump96/pg_restore96 except the SET block? The patch does not give guarantee of a restoration, it just avoids setting unsupported parameters for pg_restore the same way as pg_dump does. The other issues are for solving by the user who wants to restore to a DB of older version. > Taking a step back, there are any number of ways that you might get > errors during a pg_restore into a DB that's not set up exactly as pg_dump > expected. Missing roles or tablespaces, for example. It does not depends on a pg_dump/pg_restore version and can be soved by using command line options: --no-tablespaces --no-owner --no-privileges > Again, the dump output is constructed so that you can survive those problems > and bull ahead ... but not with "-e -1". I think "-e -1" was invented specially for it --- stop restoring if something is going wrong. Wasn't it? > I don't see a very good reason why > older-server-version shouldn't be considered the same type of problem. > > The patch as given seems rather broken anyway --- won't it change text > output from pg_dump as well as on-line output from pg_restore? My opinion --- no (and I wrote it in the initial letter): because it is impossible to know what version of a database is used for that plain text output. Users who use output to a plain text are able to use sed (or something else) to delete unwanted rows. > (That is, > it looks to me like the SETs emitted by pg_dump to text format would > depend on the source server version, which they absolutely should not. > Either that, or the patch is overwriting pg_dump's idea of what the > source server version is at the start of the output phase, which is > likely to break all kinds of stuff when dumping from an older server.) I agree, that's why I left current behavior as is for the plain text output. > It's possible that we could alleviate this specific symptom by arranging > for the BEGIN caused by "-1" to come out after the initial SETs, but > I'm not sure how complicated that would be or whether it's worth the > trouble. It leads to change ERRORs to WARNINGs but current behavior discussed above is left the same. Why don't just avoid SET parameters when we know for sure they are not supported by the server to which pg_restore is connected? -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Detect supported SET parameters when pg_restore is run
Robert Haas writes: > On Mon, Sep 26, 2016 at 9:56 PM, Vitaly Burovoy > wrote: >> We do dump/restore schemas/data via custom/dir formats and we have to >> keep several client versions for 9.2, 9.4 and 9.5 versions on local >> workstations because after pg_restore95 connects to 9.2, it fails when >> it sets run-time parameters via SET: > I think our policy is that a newer pg_dump needs to work with an older > version of the database, but I don't think we make any similar > guarantee for other tools, such as pg_restore. It's an interesting > question whether we should try a little harder to do that, but I > suspect it might require more changes than what you've got in this > patch The general policy has always been that pg_dump output is only expected to restore without errors into a server that's the same or newer version as pg_dump (regardless of the server version being dumped from). If you try to restore into an older server version, you may get some errors, which we try to minimize the scope of when possible. But it will never be possible to promise none at all. I think the correct advice here is simply "don't use pg_restore -e -1 when trying to restore into an older server version". Taking a step back, there are any number of ways that you might get errors during a pg_restore into a DB that's not set up exactly as pg_dump expected. Missing roles or tablespaces, for example. Again, the dump output is constructed so that you can survive those problems and bull ahead ... but not with "-e -1". I don't see a very good reason why older-server-version shouldn't be considered the same type of problem. The patch as given seems rather broken anyway --- won't it change text output from pg_dump as well as on-line output from pg_restore? (That is, it looks to me like the SETs emitted by pg_dump to text format would depend on the source server version, which they absolutely should not. Either that, or the patch is overwriting pg_dump's idea of what the source server version is at the start of the output phase, which is likely to break all kinds of stuff when dumping from an older server.) It's possible that we could alleviate this specific symptom by arranging for the BEGIN caused by "-1" to come out after the initial SETs, but I'm not sure how complicated that would be or whether it's worth the trouble. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Detect supported SET parameters when pg_restore is run
On Mon, Sep 26, 2016 at 9:56 PM, Vitaly Burovoy wrote: > At work we use several major versions of PostgreSQL, and developers > use non-local clusters for developing and debugging. > We do dump/restore schemas/data via custom/dir formats and we have to > keep several client versions for 9.2, 9.4 and 9.5 versions on local > workstations because after pg_restore95 connects to 9.2, it fails when > it sets run-time parameters via SET: > > vitaly@work 01:21:26 ~ $ pg_restore95 --host DEV_HOST_9_2 -d DBNAME > --data-only -e -1 -Fc arhcivefile > Password: > pg_restore95: [archiver (db)] Error while INITIALIZING: > pg_restore95: [archiver (db)] could not execute query: ERROR: > unrecognized configuration parameter "lock_timeout" > Command was: SET lock_timeout = 0; I think our policy is that a newer pg_dump needs to work with an older version of the database, but I don't think we make any similar guarantee for other tools, such as pg_restore. It's an interesting question whether we should try a little harder to do that, but I suspect it might require more changes than what you've got in this patch -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers