Re: [HACKERS] pg_basebackup stream xlog to tar
On Sun, Oct 23, 2016 at 10:28 PM, Magnus Haganderwrote: > It also broke the tests and invalidated some documentation. But it was easy > enough to fix. > > I've now applied this, so next time you get to do the merging :P Joking > aside, please review and let me know if you can spot something I messed up > in the final merge. Just had another look at it.. +static int +tar_fsync(Walfile f) +{ + Assert(f != NULL); + tar_clear_error(); + + /* +* Always sync the whole tarfile, because that's all we can do. This makes +* no sense on compressed files, so just ignore those. +*/ + if (tar_data->compression) + return 0; + + return fsync(tar_data->fd); +} fsync() should not be called here if --no-sync is used. + /* sync the empty blocks as well, since they're after the last file */ + fsync(tar_data->fd); Similarly, the fsync call of tar_finish() should not happen when --no-sync is used. + if (format == 'p') + stream.walmethod = CreateWalDirectoryMethod(param->xlog, do_sync); + else + stream.walmethod = CreateWalTarMethod(param->xlog, compresslevel, do_sync); LogStreamerMain() exits immediately once it is done, but I think that we had better be tidy here and clean up the WAL methods that have been allocated. I am thinking here about a potentially retry method on failure, though the best shot in this area would be with ReceiveXlogStream(). Attached is a patch addressing those points. -- Michael pg_basebackup-tar-fixes.patch Description: invalid/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] issue with track_commit_timestamp and server restart
On 24 October 2016 at 12:58, Craig Ringerwrote: > The attached patch adds a TAP test to src/test/recovery to show this. Added to CF. https://commitfest.postgresql.org/11/834/ ... but IMO this should go in the next bugfix release. I've also applied nearly the same fix for the same bug in the original commit timestamp implementation in the BDR Postgres tree. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] issue with track_commit_timestamp and server restart
(thread hijacking) On Mon, Oct 24, 2016 at 1:58 PM, Craig Ringerwrote: > To save time running the recovery suite, just > >rm src/test/recovery/00[0-8]*.pl > > (It'd be nice to have a prove_check target to run just one test file). Agreed! Or multiple chosen files. I usually remove those files manually from time to time when working on a fix just to run one dedicated test repeatedly. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] issue with track_commit_timestamp and server restart
On 22 October 2016 at 19:51, Julien Rouhaudwrote: > I just noticed that if track_commit_timestamp is enabled, the > oldestCommitTsXid and newestCommitTsXid don't persist after a server > restart, so you can't ask for the commit ts of a transaction that > committed before the last server start, although the information is > still available (unless of course if a freeze occured). AFAICT it > always behave this way. I initially could'n't see this when tested on 8f1fb7d with a src/test/recovery/t test script. But it turns out it's OK on immediate shutdown and crash recovery, but not on clean shutdown and restart. The attached patch adds a TAP test to src/test/recovery to show this. If you run the TAP test before recompiling with the fix it'll fail. "make" to apply the fix, then re-run and it'll succeed. Or just temporarily roll back the fix with: git checkout HEAD^1 -- src/backend/access/transam/commit_ts.c git reset src/backend/access/transam/commit_ts.c and rebuild to see it fail. To save time running the recovery suite, just rm src/test/recovery/00[0-8]*.pl (It'd be nice to have a prove_check target to run just one test file). This would explain a few issues I've seen reported with BDR from the community which I've so far been unable to reproduce, so thanks very much for the report. Álvaro, would you mind checking this and committing to HEAD and 9.6? The commits.c change only should also be committed to 9.5, but not the TAP test. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services From 485ecc6c579073cf71ac88746c53668733bc5cac Mon Sep 17 00:00:00 2001 From: Craig Ringer Date: Mon, 24 Oct 2016 12:16:41 +0800 Subject: [PATCH] Preserve access to commit timestamps across restart An oversight in setting the boundaries of known commit timestamps during startup caused old commit timestamps to become inaccessible after a server restart. Report and bugfix by Julien Rouhaud Review and commit timestamp tests by Craig Ringer --- src/backend/access/transam/commit_ts.c | 2 + src/test/recovery/t/009_committs.pl| 103 + 2 files changed, 105 insertions(+) create mode 100644 src/test/recovery/t/009_committs.pl diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c index a8d275f..7746578 100644 --- a/src/backend/access/transam/commit_ts.c +++ b/src/backend/access/transam/commit_ts.c @@ -844,6 +844,8 @@ SetCommitTsLimit(TransactionId oldestXact, TransactionId newestXact) else { Assert(ShmemVariableCache->newestCommitTsXid == InvalidTransactionId); + ShmemVariableCache->oldestCommitTsXid = oldestXact; + ShmemVariableCache->newestCommitTsXid = newestXact; } LWLockRelease(CommitTsLock); } diff --git a/src/test/recovery/t/009_committs.pl b/src/test/recovery/t/009_committs.pl new file mode 100644 index 000..48f8899 --- /dev/null +++ b/src/test/recovery/t/009_committs.pl @@ -0,0 +1,103 @@ +# Testing of logical decoding using SQL interface and/or pg_recvlogical +use strict; +use warnings; +use PostgresNode; +use TestLib; +use Test::More tests => 16; + +my $node_master = get_new_node('master'); +$node_master->init(allows_streaming => 1); +$node_master->append_conf( + 'postgresql.conf', qq( +track_commit_timestamp = on +)); +$node_master->start; + +my ($ret, $stdout, $stderr); + +($ret, $stdout, $stderr) = $node_master->psql('postgres', qq[SELECT pg_xact_commit_timestamp('0');]); +is($ret, 3, 'getting ts of InvalidTransactionId reports error'); +like($stderr, qr/cannot retrieve commit timestamp for transaction/, 'expected error from InvalidTransactionId'); + +($ret, $stdout, $stderr) = $node_master->psql('postgres', qq[SELECT pg_xact_commit_timestamp('1');]); +is($ret, 3, 'getting ts of BootstrapTransactionId reports error'); +like($stderr, qr/cannot retrieve commit timestamp for transaction/, 'expected error from BootstrapTransactionId'); + +($ret, $stdout, $stderr) = $node_master->psql('postgres', qq[SELECT pg_xact_commit_timestamp('2');]); +is($ret, 3, 'getting ts of FrozenTransactionId reports error'); +like($stderr, qr/cannot retrieve commit timestamp for transaction/, 'expected error from FrozenTransactionId'); + +# Since FirstNormalTransactionId will've occurred during initdb, long before we +# enabled commit timestamps, it'll be null since we have no cts data for it but +# cts are enabled. +is($node_master->safe_psql('postgres', qq[SELECT pg_xact_commit_timestamp('3');]), '', + 'committs for FirstNormalTransactionId is null'); + +$node_master->safe_psql('postgres', qq[CREATE TABLE committs_test(x integer, y timestamp with time zone);]); + +my $xid = $node_master->safe_psql('postgres', qq[ + BEGIN; + INSERT INTO committs_test(x, y) VALUES (1, current_timestamp); + SELECT txid_current(); + COMMIT; +]); + +my $before_restart_ts = $node_master->safe_psql('postgres', qq[SELECT
Re: [HACKERS] [COMMITTERS] pgsql: Remove extra comma at end of enum list
(Moved to -hackers) On Sun, Oct 23, 2016 at 10:57 PM, Magnus Haganderwrote: > Remove extra comma at end of enum list > > C99-specific feature, and wasn't intentional in the first place. > > Per buildfarm member mylodon This is the second time I see that in the last couple of weeks. Would it be better to use some custom CFLAGS to detect that earlier in the review process? I have never much used gcc's std or clang extensions in this area. Has somebody any recommendations? -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_basebackup stream xlog to tar
On Mon, Oct 24, 2016 at 1:38 PM, Andres Freundwrote: > On 2016-10-17 14:37:05 +0900, Michael Paquier wrote: >> 2) Add an option to pg_xlogdump to be able to output its output to a >> file. That would be awkward to rely on grabbing the output data from a >> pipe... On Windows particularly. Thinking about it, would that >> actually be useful to others? That's not a complicated patch. > > Hm? Just redirecting output seems less complicated? And afaik works on > windows as well? In the TAP suite STDOUT is already redirect to the log files. Perhaps we could just do a SELECT FILE; to redirect the output of pg_xlogdump temporarily into a custom location just for the sake of a test like that. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [BUG] pg_basebackup from disconnected standby fails
On Mon, Oct 24, 2016 at 9:18 AM, Kyotaro HORIGUCHIwrote: > Thank you for looking and retelling this. > > At Fri, 21 Oct 2016 13:02:21 -0400, Robert Haas wrote > in >> On Sun, Oct 2, 2016 at 8:52 AM, Michael Paquier >> wrote: >> >> So, if I understand correctly, then we can mark the version posted by >> >> you upthread [1] which includes a test along with Kyotaro's fix can be >> >> marked as Ready for committer. If so, then please change the status >> >> of patch accordingly. >> > >> > Patch moved to next CF 2016-11, still with status "ready for committer". >> > >> > IMO, this thread has reached as conclusion to use this patch to >> > address the problem reported: >> > cab7npqtv5gmkqcndofgtgqoqxz2xlz4rrw247oqojzztvy6...@mail.gmail.com >> >> I spent some time reading this thread today and trying to understand >> exactly what's going on here. Here's my attempt to restate the >> problem: >> >> 1. If a restartpoint flushes no dirty buffers, then the redo location >> advances but the minimum recovery point does not; therefore, the >> minimum recovery point can fall behind the redo location. That's >> correct behavior, because if the standby is subsequently restarted, it >> can correctly begin at the checkpoint record associated with the >> restartpoint and need not replay any further. > > Yes. > >> 2. However, it causes a problem when a base backup with the "include >> WAL" option is taken from a standby because -- on standby servers only >> -- the end location for a backup as returned by do_pg_stop_backup() is >> the minimum recovery point. If this precedes the start point for the >> backup -- which is the restart point -- perform_base_backup() will >> conclude that no WAL files are required for the backup and fail an >> internal sanity check. > > Yes. The option implies that the copied $PGDATA is a standalone > backup, that is, usable to start a standby instantly so at least > one WAL segment is needed, as mentioned in 3 below. > >> 3. Removing the sanity check wouldn't help, because it can never be >> right to end up with zero WAL files as a result of a base backup. At >> a minimum, we need the WAL file which contains the checkpoint record >> from which the control file specifies that redo should start. >> Actually, I think that checkpoint record could be spread across >> multiple files: it might be big. We need them all, but the current >> code won't ensure that we get them. > > Yes. The "checkpoit records" means that records during a checkpoint. > >> The consensus solution on this thread seems to be that we should have >> pg_do_stop_backup() return the last-replayed XLOG location as the >> backup end point. If the control file has been updated with a newer >> redo location, then the associated checkpoint record has surely been >> replayed, so we'll definitely have enough WAL to replay that >> checkpoint record (and we don't need to replay anything else, because >> we're supposing that this is the case where the minimum recovery point >> precedes the redo location). Although this will work, it might >> include more WAL in the backup than is strictly necessary. If the >> additional WAL replayed beyond the minimum recovery point does NOT >> include a checkpoint, we don't need any of it; if it does, we need >> only the portion up to and including the last checkpoint record, and >> not anything beyond that. > > StartupXLOG mandates the existence of the start record of the > latest checkpoint. pg_start_backup() returns the start point > (redo location) of the restartpoint that it requested and > minRecoveryPoint is always after the replpayed checkpoint so it > works if always was going well. > > But if the checkpoint contains no update, the restart point > exceeds minRecoveryPoint. Then no WAL copied. > >> I can think of two solutions that would be "tighter": >> >> 1. When performing a restartpoint, update the minimum recovery point >> to just beyond the checkpoint record. I think this can't hurt anyone >> who is actually restarting recovery on the same machine, because >> that's exactly the point where they are going to start recovery. A >> minimum recovery point which precedes the point at which they will >> start recovery is no better than one which is equal to the point at >> which they will start recovery. > > This is a candidate that I thought of. But I avoided to change > the behavior of minRecoveryPoint that (seems to me that) it > describes only buffer state. So checkpoint with no update doesn't > advances minRecoveryPoint as my understanding. > I think what you are saying is not completely right, because we do update minRecoveryPoint when we don't perform a new restart point. When we perform restart point, then it assumes that flushing the buffers will take care of updating minRecoveryPoint. -- With Regards, Amit Kapila. EnterpriseDB:
Re: [HACKERS] pg_basebackup stream xlog to tar
On 2016-10-17 14:37:05 +0900, Michael Paquier wrote: > 2) Add an option to pg_xlogdump to be able to output its output to a > file. That would be awkward to rely on grabbing the output data from a > pipe... On Windows particularly. Thinking about it, would that > actually be useful to others? That's not a complicated patch. Hm? Just redirecting output seems less complicated? And afaik works on windows as well? Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_basebackup stream xlog to tar
On Sun, Oct 23, 2016 at 10:52 PM, Michael Paquierwrote: > On Sun, Oct 23, 2016 at 10:30 PM, Magnus Hagander wrote: >> I think both of those would be worthwhile. Just for the testability in >> itself, but such a flag to pg_xlogdump would probably be useful in other >> cases as well, beyond just the testing. > > Looking quickly at the code, it does not seem that complicated... I > may just send patches tomorrow for all those things and be done with > it, all that on its new dedicated thread. Actually not that much after noticing that pg_xlogdump emulates some of the backend's StringInfo routines and calls at the end vprintf() to output everything to stdout, which is ugly. The cleanest solution here would be to make StringInfo a bit more portable and allow them for frontends, somehting that may be useful for any utility playing with rm_desc. A less cleaner solution would be to somewhat store a fd pointing to a file (or stdout) into compat.c and output to it. I'd slightly prefer the first solution, but that does not seem worth the effort just for pg_xlogdump and one test. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [BUG] pg_basebackup from disconnected standby fails
On Fri, Oct 21, 2016 at 10:32 PM, Robert Haaswrote: > On Sun, Oct 2, 2016 at 8:52 AM, Michael Paquier > wrote: >>> So, if I understand correctly, then we can mark the version posted by >>> you upthread [1] which includes a test along with Kyotaro's fix can be >>> marked as Ready for committer. If so, then please change the status >>> of patch accordingly. >> >> Patch moved to next CF 2016-11, still with status "ready for committer". >> >> IMO, this thread has reached as conclusion to use this patch to >> address the problem reported: >> cab7npqtv5gmkqcndofgtgqoqxz2xlz4rrw247oqojzztvy6...@mail.gmail.com > > I spent some time reading this thread today and trying to understand > exactly what's going on here. Here's my attempt to restate the > problem: > > 1. If a restartpoint flushes no dirty buffers, then the redo location > advances but the minimum recovery point does not; therefore, the > minimum recovery point can fall behind the redo location. That's > correct behavior, because if the standby is subsequently restarted, it > can correctly begin at the checkpoint record associated with the > restartpoint and need not replay any further. > > 2. However, it causes a problem when a base backup with the "include > WAL" option is taken from a standby because -- on standby servers only > -- the end location for a backup as returned by do_pg_stop_backup() is > the minimum recovery point. If this precedes the start point for the > backup -- which is the restart point -- perform_base_backup() will > conclude that no WAL files are required for the backup and fail an > internal sanity check. > > 3. Removing the sanity check wouldn't help, because it can never be > right to end up with zero WAL files as a result of a base backup. At > a minimum, we need the WAL file which contains the checkpoint record > from which the control file specifies that redo should start. > Actually, I think that checkpoint record could be spread across > multiple files: it might be big. We need them all, but the current > code won't ensure that we get them. > > The consensus solution on this thread seems to be that we should have > pg_do_stop_backup() return the last-replayed XLOG location as the > backup end point. If the control file has been updated with a newer > redo location, then the associated checkpoint record has surely been > replayed, so we'll definitely have enough WAL to replay that > checkpoint record (and we don't need to replay anything else, because > we're supposing that this is the case where the minimum recovery point > precedes the redo location). Although this will work, it might > include more WAL in the backup than is strictly necessary. If the > additional WAL replayed beyond the minimum recovery point does NOT > include a checkpoint, we don't need any of it; if it does, we need > only the portion up to and including the last checkpoint record, and > not anything beyond that. > You are right that it will include additional WAL than strictly necessary, but that can happen today as well because minrecoverypoint could be updated after you have established restart point for do_pg_start_backup(). Do you want to say that even without patch in some cases we are including additional WAL in backup than what is strictly necessary, so it is better to improve the situation? > I can think of two solutions that would be "tighter": > > 1. When performing a restartpoint, update the minimum recovery point > to just beyond the checkpoint record. I think this can't hurt anyone > who is actually restarting recovery on the same machine, because > that's exactly the point where they are going to start recovery. A > minimum recovery point which precedes the point at which they will > start recovery is no better than one which is equal to the point at > which they will start recovery. > I think this will work but will cause an unnecessary control file update for the cases when buffer flush will anyway do it. It can work if we try to do only when required (minrecoverypoint is lesser than lastcheckpoint) after flush of buffers. > 2. In perform_base_backup(), if the endptr returned by > do_pg_stop_backup() precedes the end of the checkpoint record returned > by do_pg_start_backup(), use the latter value instead. Unfortunately, > that's not so easy: we can't just say if (endptr < starptr) startptr = > endptr; because startptr is the *start* of the checkpoint record, not > the end. I suspect somebody could figure out a solution to this > problem, though. > With this approach, don't we need something similar for API's pg_stop_backup()/pg_stop_backup_v2()? > If we decide we don't want to aim for one of these tighter solutions > and just adopt the previously-discussed patch, then at the very least > it needs better comments. > +1. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list
Re: [HACKERS] FSM corruption leading to errors
On Sat, Oct 22, 2016 at 7:31 AM, Michael Paquierwrote: > On Sat, Oct 22, 2016 at 5:17 AM, Jim Nasby wrote: >> On 10/20/16 10:15 PM, Michael Paquier wrote: >>> >>> 2) If anything is found, stop the server and delete the files manually. >>> 3) Re-start the server. >>> OK, that's troublesome and costly for large relations, but we know >>> that's the safest way to go for any versions, and there is no need to >>> complicate the code with any one-time repairing extensions. >> >> Wouldn't you need to run around to all your replicas and do that as well? > > Yeah... Release notes refer to this page for methods to fix corrupted instances: https://wiki.postgresql.org/wiki/Free_Space_Map_Problems I just typed something based on Pavan's upper method, feel free to jump in and improve it as necessary. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [BUG] pg_basebackup from disconnected standby fails
Thank you for looking and retelling this. At Fri, 21 Oct 2016 13:02:21 -0400, Robert Haaswrote in > On Sun, Oct 2, 2016 at 8:52 AM, Michael Paquier > wrote: > >> So, if I understand correctly, then we can mark the version posted by > >> you upthread [1] which includes a test along with Kyotaro's fix can be > >> marked as Ready for committer. If so, then please change the status > >> of patch accordingly. > > > > Patch moved to next CF 2016-11, still with status "ready for committer". > > > > IMO, this thread has reached as conclusion to use this patch to > > address the problem reported: > > cab7npqtv5gmkqcndofgtgqoqxz2xlz4rrw247oqojzztvy6...@mail.gmail.com > > I spent some time reading this thread today and trying to understand > exactly what's going on here. Here's my attempt to restate the > problem: > > 1. If a restartpoint flushes no dirty buffers, then the redo location > advances but the minimum recovery point does not; therefore, the > minimum recovery point can fall behind the redo location. That's > correct behavior, because if the standby is subsequently restarted, it > can correctly begin at the checkpoint record associated with the > restartpoint and need not replay any further. Yes. > 2. However, it causes a problem when a base backup with the "include > WAL" option is taken from a standby because -- on standby servers only > -- the end location for a backup as returned by do_pg_stop_backup() is > the minimum recovery point. If this precedes the start point for the > backup -- which is the restart point -- perform_base_backup() will > conclude that no WAL files are required for the backup and fail an > internal sanity check. Yes. The option implies that the copied $PGDATA is a standalone backup, that is, usable to start a standby instantly so at least one WAL segment is needed, as mentioned in 3 below. > 3. Removing the sanity check wouldn't help, because it can never be > right to end up with zero WAL files as a result of a base backup. At > a minimum, we need the WAL file which contains the checkpoint record > from which the control file specifies that redo should start. > Actually, I think that checkpoint record could be spread across > multiple files: it might be big. We need them all, but the current > code won't ensure that we get them. Yes. The "checkpoit records" means that records during a checkpoint. > The consensus solution on this thread seems to be that we should have > pg_do_stop_backup() return the last-replayed XLOG location as the > backup end point. If the control file has been updated with a newer > redo location, then the associated checkpoint record has surely been > replayed, so we'll definitely have enough WAL to replay that > checkpoint record (and we don't need to replay anything else, because > we're supposing that this is the case where the minimum recovery point > precedes the redo location). Although this will work, it might > include more WAL in the backup than is strictly necessary. If the > additional WAL replayed beyond the minimum recovery point does NOT > include a checkpoint, we don't need any of it; if it does, we need > only the portion up to and including the last checkpoint record, and > not anything beyond that. StartupXLOG mandates the existence of the start record of the latest checkpoint. pg_start_backup() returns the start point (redo location) of the restartpoint that it requested and minRecoveryPoint is always after the replpayed checkpoint so it works if always was going well. But if the checkpoint contains no update, the restart point exceeds minRecoveryPoint. Then no WAL copied. > I can think of two solutions that would be "tighter": > > 1. When performing a restartpoint, update the minimum recovery point > to just beyond the checkpoint record. I think this can't hurt anyone > who is actually restarting recovery on the same machine, because > that's exactly the point where they are going to start recovery. A > minimum recovery point which precedes the point at which they will > start recovery is no better than one which is equal to the point at > which they will start recovery. This is a candidate that I thought of. But I avoided to change the behavior of minRecoveryPoint that (seems to me that) it describes only buffer state. So checkpoint with no update doesn't advances minRecoveryPoint as my understanding. > 2. In perform_base_backup(), if the endptr returned by > do_pg_stop_backup() precedes the end of the checkpoint record returned > by do_pg_start_backup(), use the latter value instead. Unfortunately, > that's not so easy: we can't just say if (endptr < starptr) startptr = > endptr; because startptr is the *start* of the checkpoint record, not > the end. I suspect somebody could figure out a solution to this > problem, though. Yes, and the reason I rejected it was that it is not logical,
[HACKERS] Add radiustimeout parameter for RADIUS HBA
Hello everyone, I’d like to submit the attached patch for feedback from the PostgreSQL community and potential future inclusion in the codebase. The patch adds a new parameter to the RADIUS authentication method named “radiustimeout”, allowing the database administrator to configure the timeout in seconds to wait for responses from a configured RADIUS server. Until now, this has been hardcoded to three seconds by the RADIUS_TIMEOUT define in auth.c. While this is usually sufficient for typical RADIUS server configurations, there are some more unusual configurations where a higher timeout is required. Examples include: - Authenticating against a RADIUS server over a high latency link - Authenticating against a RADIUS server that is performing additional out-of-band authentication The latter case is applicable to a server I admin and spurred the development of this patch. We implemented multi-factor authentication for user access to a sensitive database via a RADIUS server implementation which performs the standard username & password verification, and if it succeeds, subsequently performs a second factor of authentication via a configured mobile app. The RADIUS response confirming successful authentication is only returned after both authentication factors have completed. In our deployment, a timeout of 60 seconds seems to work well, but certainly three seconds is not at all workable. Thanks in advance for any and all feedback. Kind regards, -SDL radiustimeout.patch Description: radiustimeout.patch -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Minor code improvement to postgresGetForeignJoinPaths
Hi, The last argument of create_foreignscan_path called by postgresGetForeignJoinPaths is set to NULL, but it would be suitable to set it to NIL because the argument type is List. Please find attached a patch. Tatsuro Yamada NTT Open Source Software Center diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index daf0438..a5de611 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -4331,7 +4331,7 @@ postgresGetForeignJoinPaths(PlannerInfo *root, NIL, /* no pathkeys */ NULL, /* no required_outer */ epq_path, - NULL); /* no fdw_private */ + NIL); /* no fdw_private */ /* Add generated path into joinrel by add_path(). */ add_path(joinrel, (Path *) joinpath); -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] LLVM Address Sanitizer (ASAN) and valgrind support
On Thu, Oct 20, 2016 at 06:14:28PM +0100, Greg Stark wrote: > On Oct 20, 2016 5:27 PM, "Noah Misch"wrote: > > On Wed, Oct 19, 2016 at 11:08:39AM +0100, Greg Stark wrote: > > > The MEMPOOL_FREE doesn't take any size argument and mcxt.c doesn't > > > have convenient access to a size argument. It could call the > > > GetChunkSpace method but that will include the allocation overhead and > > > > That is indeed a problem for making VALGRIND_MEMPOOL_FREE() an alias of > > VALGRIND_MAKE_MEM_NOACCESS() under ASAN as I suggested. Calling > > GetMemoryChunkSpace() in the macro would cause memdebug.h to embed an > > assumption of mcxt.c, which is messy. Including the allocation overhead is > > fine, though. > > I think the way out is to simply have aset.c make the memory > undefined/noaccess even if it's redundant under valgrind. It's a bit > unfortunate that the macros would have different semantics under the two. > If it's too confusing or we're worried about the performance overhead we > could make a MAKE_MEM_{UNDEFINED,NOACCESS}_IF_NO_MEMPOOL() but I don't > think it's worth it myself. I don't expect much performance overhead. When I last benchmarked Valgrind Memcheck of "make installcheck", a !USE_VALGRIND build (no client requests at all) saved about 5% of runtime. A single new client request should be cheap enough. (Marking otherwise-redundant calls may still be good, though.) > There are a couple build oddities both with gcc and clang before I can > commit anything though. And I can't test valgrind to be sure the redundant > calls aren't causing a problem. When you submit your patch to a CommitFest, mention that you're blocked on having a reviewer who can test Valgrind. Many reviewers can help. Thanks, nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] On conflict update & hint bits
On Sun, Oct 23, 2016 at 2:46 PM, Tom Lanewrote: > What's bothering me is (a) it's less than 24 hours to release wrap and > (b) this patch changes SSI-relevant behavior and hasn't been approved > by Kevin. I'm not familiar enough with that logic to commit a change > in it on my own authority, especially with so little time for problems > to be uncovered. I should point out that I knew that the next set of point releases had been moved forward much later than you did. I had to work on this fix during the week, which was pretty far from ideal for me for my own reasons. > I'm okay with adding an explicit buffer lock/unlock pair, and in fact > plan to go have a look at that in a bit. I'm not okay with doing a > refactoring that might change the behavior in more ways than that > under these circumstances. Fair enough. As long as we do that much, I'm comfortable. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] On conflict update & hint bits
Peter Geogheganwrites: > On Sun, Oct 23, 2016 at 1:42 PM, Tom Lane wrote: >> "Rarely" is not "never". A bigger problem though is that heap_fetch >> does more than just lock the buffer: there are also PredicateLockTuple >> and CheckForSerializableConflictOut calls in there. It's possible that >> those are no-ops in this usage (because after all we already fetched >> the tuple once), or maybe they're even desirable because they would help >> resolve Kevin's concerns. But it hasn't been analyzed and so I'm not >> prepared to risk a behavioral change in this already under-tested area >> the day before a release wrap. > I'm surprised at how you've assessed the risk here. What's bothering me is (a) it's less than 24 hours to release wrap and (b) this patch changes SSI-relevant behavior and hasn't been approved by Kevin. I'm not familiar enough with that logic to commit a change in it on my own authority, especially with so little time for problems to be uncovered. I'm okay with adding an explicit buffer lock/unlock pair, and in fact plan to go have a look at that in a bit. I'm not okay with doing a refactoring that might change the behavior in more ways than that under these circumstances. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] On conflict update & hint bits
On Sun, Oct 23, 2016 at 1:42 PM, Tom Lanewrote: > "Rarely" is not "never". A bigger problem though is that heap_fetch > does more than just lock the buffer: there are also PredicateLockTuple > and CheckForSerializableConflictOut calls in there. It's possible that > those are no-ops in this usage (because after all we already fetched > the tuple once), or maybe they're even desirable because they would help > resolve Kevin's concerns. But it hasn't been analyzed and so I'm not > prepared to risk a behavioral change in this already under-tested area > the day before a release wrap. The heap_fetch() contract doesn't ask its callers to consider any of that. Besides, those actions (PredicateLockTuple + CheckForSerializableConflictOut) are very probably redundant no-ops as you say. (They won't help with Kevin's additional concerns, BTW, because Kevin is concerned about excessive serialization failures with SSI.) > AFAICT, it's very hard to get to an actual failure from the lack of > buffer lock here. You would need to have a situation where the tuple > was not hintable when originally fetched but has become so by the time > ON CONFLICT is rechecking it. That's possible, eg if we're using > async commit and the previous transaction's commit record has gotten > flushed to disk in between, but it's not likely. Even then, no actual > problem would ensue (in non-assert builds) from setting a hint bit without > buffer lock, except in very hard-to-hit race conditions. So the buffer > lock issue doesn't seem urgent enough to me to justify making a fix under > time pressure. > > The business about not throwing a serialization failure due to actions > of our own xact does seem worth fixing now, but if I understand correctly, > we can deal with that by adding a test for xmin-is-our-own-xact into > the existing code structure. I propose doing that much (and adding > Munro's regression test case) and calling it good for today. I'm surprised at how you've assessed the risk here. I think that the risk of not proceeding with fixing the buffer lock issue is greater than the risk of breaking something else with the proposed fix. Even if the former risk isn't such a big one. If there are a non-trivial number of users that are particularly attached to the precise behavior of higher isolation levels in master (assuming it would be altered by the proposed fix), then it's surprising that it took so many months for someone to complain about the ON CONFLICT DO NOTHING behavior being blatantly broken. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] On conflict update & hint bits
Peter Geogheganwrites: > On Sat, Oct 22, 2016 at 9:38 AM, Tom Lane wrote: >> 1. Why are you replacing ExecOnConflictUpdate's ExecCheckHeapTupleVisible >> call with ExecCheckTIDVisible? That appears to demand a re-fetch of the >> tuple ExecOnConflictUpdate already has its hands on. Wouldn't it be >> better to just put a buffer lock/unlock around the existing code? > I thought that that was less than idiomatic within nodeModifyTable.c > -- executor modules rarely directly acquire buffer locks. "Rarely" is not "never". A bigger problem though is that heap_fetch does more than just lock the buffer: there are also PredicateLockTuple and CheckForSerializableConflictOut calls in there. It's possible that those are no-ops in this usage (because after all we already fetched the tuple once), or maybe they're even desirable because they would help resolve Kevin's concerns. But it hasn't been analyzed and so I'm not prepared to risk a behavioral change in this already under-tested area the day before a release wrap. AFAICT, it's very hard to get to an actual failure from the lack of buffer lock here. You would need to have a situation where the tuple was not hintable when originally fetched but has become so by the time ON CONFLICT is rechecking it. That's possible, eg if we're using async commit and the previous transaction's commit record has gotten flushed to disk in between, but it's not likely. Even then, no actual problem would ensue (in non-assert builds) from setting a hint bit without buffer lock, except in very hard-to-hit race conditions. So the buffer lock issue doesn't seem urgent enough to me to justify making a fix under time pressure. The business about not throwing a serialization failure due to actions of our own xact does seem worth fixing now, but if I understand correctly, we can deal with that by adding a test for xmin-is-our-own-xact into the existing code structure. I propose doing that much (and adding Munro's regression test case) and calling it good for today. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Avoiding pin scan during btree vacuum
Robert Haas wrote: > On Wed, Oct 19, 2016 at 6:30 PM, Alvaro Herrera >wrote: > > Robert Haas wrote: > >> On Mon, Jan 4, 2016 at 10:30 AM, Tom Lane wrote: > >> >> This seems like a might subtle thing to backpatch. If we really want to > >> >> go there, ISTM that the relevant code should stew in an unreleased > >> >> branch for a while, before being backpatched. > >> > > >> > I'm definitely -1 on back-patching such a thing. Put it in HEAD for > >> > awhile. If it survives six months or so then we could discuss it again. > >> > >> I agree with Tom. > > > > Okay, several months have passed with this in the development branch and > > now seems a good time to backpatch this all the way back to 9.4. > > > > Any objections? > > Although the code has now been in the tree for six months, it's only > been in a released branch for three weeks, which is something about > which to think. The objection above was "stew in an unreleased branch", which it has. > I guess what's really bothering me is that I don't think this is a bug > fix. It seems like a performance optimization. And I think that we > generally have a policy that we don't back-patch performance > optimizations. Of course, there is some fuzziness because when the > performance gets sufficiently bad, we sometimes decide that it amounts > to a bug, as in 73cc7d3b240e1d46b1996382e5735a820f8bc3f7. Maybe this > case qualifies for similar treatment, but I'm not sure. I have seen a number of situations in which the standby strangely lags behind seemingly randomly sometimes for very long periods of time, without any apparent cause, without any possible remedy, and it makes the standby unusable. If the user happens to monitor the lag they may notice it and some may decide not to run certain queries. In other cases the I/O load is so bad that queries that otherwise run without a hitch are stuck for long. In my opinion, this is a serious problem. > Why are you talking about back-patching this to 9.4 rather than all > supported branches? As far as I understand this is dependent on catalog scans being MVCC, so it cannot be backpatched any further than that. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_xlog error on the master
On Sunday, 23 October 2016, Michael Paquierwrote: > On Sun, Oct 23, 2016 at 5:05 PM, Venkata B Nagothi > wrote: > > I just did did a "git pull" to test one of my patches and i get the > > following error : > > > > 2016-10-23 18:51:47.679 AEDT [31930] FATAL: could not open archive > status > > directory "pg_xlog/archive_status": No such file or directory > > 2016-10-23 18:51:47.679 AEDT [31841] LOG: archiver process (PID 31930) > > exited with exit code 1 > > > > is it expected ? > > No. > > > I notice that pg_xlog's name has been changed to pg_wal. I am not sure > about > > this. > > WAL archiving works correctly here, and tests in src/test/recovery/ > work. Are you sure that you cleaned up up your source tree before > recompiling? Oops. My bad. All works fine now. Sorry for the confusion. Thanks, Venkata B N
[HACKERS] Assertion failures due to testing visibility without buffer lock
Looking at the issue raised in https://www.postgresql.org/message-id/flat/57EE93C8.8080504%40postgrespro.ru prompted me to wonder whether there were any other places in which we were calling tqual.c routines without holding buffer content lock. I tried adding Assert(BufferIsLocal(buffer) || LWLockHeldByMe(BufferDescriptorGetContentLock(GetBufferDescriptor(buffer - 1; to all the tqual.c tuple-visibility-test routines, and sure enough a bunch of regression tests fell over, as a result of this bit in RI_FKey_check: /* * We should not even consider checking the row if it is no longer valid, * since it was either deleted (so the deferred check should be skipped) * or updated (in which case only the latest version of the row should be * checked). Test its liveness according to SnapshotSelf. * * NOTE: The normal coding rule is that one must acquire the buffer * content lock to call HeapTupleSatisfiesVisibility. We can skip that * here because we know that AfterTriggerExecute just fetched the tuple * successfully, so there cannot be a VACUUM compaction in progress on the * page (either heap_fetch would have waited for the VACUUM, or the * VACUUM's LockBufferForCleanup would be waiting for us to drop pin). And * since this is a row inserted by our open transaction, no one else can * be entitled to change its xmin/xmax. */ Assert(new_row_buf != InvalidBuffer); if (!HeapTupleSatisfiesVisibility(new_row, SnapshotSelf, new_row_buf)) return PointerGetDatum(NULL); Now, you'll notice that that bit of argumentation fails to cover the question of hint bits. It might still be okay, if HeapTupleSatisfiesSelf would never attempt to set a hint bit on a row inserted by our own open transaction ... but a quick perusal of that function shows that it will attempt to do so, if xmax points to an aborted subtransaction. That led me to try this test case: create table pp(f1 int primary key); insert into pp values(1); create table cc(f1 int references pp deferrable initially deferred); begin; insert into cc values(1); savepoint x; delete from cc; rollback to x; commit; and sure enough, that dies with an assertion failure in MarkBufferDirtyHint, or predecessor code, in every release back to 8.2 or thereabouts. We might be able to band-aid around that by not attempting to set the hint bit in the "deleting subtransaction must have aborted" path in HeapTupleSatisfiesSelf --- but it's easy to think of cases where not doing so would be a net loss performance-wise. And TBH that code in RI_FKey_check seems to me to be cantilevered way too far over the canyon's edge anyway, especially when one compares the cost of one buffer lock cycle to everything else the function is going to do. I think we should just make it take the buffer lock and be done. BTW, I was also able to reproduce the problem complained of in the aforementioned thread, in an existing test in postgres_fdw: #2 0x00808cdd in ExceptionalCondition ( conditionName=, errorType=, fileName=, lineNumber=) at assert.c:54 #3 0x0083e3a4 in HeapTupleSatisfiesMVCC (htup=, snapshot=0x2198b70, buffer=335) at tqual.c:981 #4 0x0060635c in ExecCheckHeapTupleVisible ( estate=, tuple=, buffer=) at nodeModifyTable.c:197 #5 0x0060845f in ExecCheckTIDVisible (node=0x21bc6e0) at nodeModifyTable.c:222 #6 ExecInsert (node=0x21bc6e0) at nodeModifyTable.c:413 #7 ExecModifyTable (node=0x21bc6e0) at nodeModifyTable.c:1496 #8 0x005ebf98 in ExecProcNode (node=0x21bc6e0) at execProcnode.c:396 #9 0x005e99ba in ExecutePlan (queryDesc=0x21b3908, direction=, count=0) at execMain.c:1567 #10 standard_ExecutorRun (queryDesc=0x21b3908, direction=, count=0) at execMain.c:338 (gdb) p debug_query_string $1 = 0x21b36f8 "INSERT INTO \"S 1\".\"T 1\"(\"C 1\", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8) ON CONFLICT DO NOTHING" So we do have coverage of this case, sort of. I don't think I want to propose putting in these Asserts as a permanent test, because it seems pretty ugly to have such code outside bufmgr.c (it required adding #include "storage/buf_internals.h" to tqual.c). On the other hand, if we'd had them we would never have shipped either of these bugs. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] condition variables
On Tue, Oct 4, 2016 at 12:12 PM, Thomas Munrowrote: > Here's a rebased patch. ConditionVariableSleep now takes > wait_event_info. Anyone using this in patches for core probably needs > to add enumerators to the WaitEventXXX enums in pgstat.h to describe > their wait points. I think that there are at least 2 patches from EDB people that are already dependent on this one. I myself could easily adopt parallel CREATE INDEX to use it, too. Why the continued delay in committing it? I think it's fairly clear that this mechanism is widely useful. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: two slab-like memory allocators
On 23/10/16 16:26, Tomas Vondra wrote: > On 10/22/2016 08:30 PM, Tomas Vondra wrote: >> On 10/20/2016 04:43 PM, Robert Haas wrote: >>> >>> ... >>> >>> The sb_alloc allocator I proposed a couple of years ago would work >>> well for this case, I think. >>> >> >> Maybe, but it does not follow the Memory Context design at all, if I >> understand it correctly. I was willing to give it a spin anyway and see >> how it compares to the two other allocators, but this is a significant >> paradigm shift and certainly much larger step than what I proposed. >> >> I'm not even sure it's possible to implement a MemoryContext based on >> the same ideas as sb_alloc(), because one of the important points of >> sb_alloc design seems to be throwing away the chunk header. While that >> may be possible, it would certainly affect the whole tree (not just the >> reorderbuffer bit), and it'd require way more work. >> >> Moreover, the two allocators I proposed significantly benefit from the >> "same lifespan" assumption. I don't think sb_alloc can do that. >> > > I've given the sb_alloc patch another try - essentially hacking it into > reorderbuffer, ignoring the issues mentioned yesterday. And yes, it's > faster than the allocators discussed in this thread. Based on a few very > quick tests on my laptop, the difference is usually ~5-10%. > > That might seem like a significant improvement, but it's negligible > compared to the "master -> slab/gen" improvement, which improves > performance by orders of magnitude (at least for the tested cases). > > Moreover, the slab/gen allocators proposed here seem like a better fit > for reorderbuffer, e.g. because they release memory. I haven't looked at > sb_alloc too closely, but I think it behaves more like AllocSet in this > regard (i.e. keeping the memory indefinitely). > For reorderbuffer, from what I've seen in practice, I'd prefer proper freeing to 5% performance gain as I seen walsenders taking GBs of memory dues to reoderbuffer allocations that are never properly freed. About your actual patch. I do like both the Slab and the Gen allocators and think that we should proceed with them for the moment. You definitely need to rename the Gen one (don't ask me to what though) as it sounds like "generic" and do some finishing touches but I think it's the way to go. I don't see any point in GenSlab anymore. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: two slab-like memory allocators
On 10/22/2016 08:30 PM, Tomas Vondra wrote: On 10/20/2016 04:43 PM, Robert Haas wrote: >> ... The sb_alloc allocator I proposed a couple of years ago would work well for this case, I think. Maybe, but it does not follow the Memory Context design at all, if I understand it correctly. I was willing to give it a spin anyway and see how it compares to the two other allocators, but this is a significant paradigm shift and certainly much larger step than what I proposed. I'm not even sure it's possible to implement a MemoryContext based on the same ideas as sb_alloc(), because one of the important points of sb_alloc design seems to be throwing away the chunk header. While that may be possible, it would certainly affect the whole tree (not just the reorderbuffer bit), and it'd require way more work. Moreover, the two allocators I proposed significantly benefit from the "same lifespan" assumption. I don't think sb_alloc can do that. I've given the sb_alloc patch another try - essentially hacking it into reorderbuffer, ignoring the issues mentioned yesterday. And yes, it's faster than the allocators discussed in this thread. Based on a few very quick tests on my laptop, the difference is usually ~5-10%. That might seem like a significant improvement, but it's negligible compared to the "master -> slab/gen" improvement, which improves performance by orders of magnitude (at least for the tested cases). Moreover, the slab/gen allocators proposed here seem like a better fit for reorderbuffer, e.g. because they release memory. I haven't looked at sb_alloc too closely, but I think it behaves more like AllocSet in this regard (i.e. keeping the memory indefinitely). FWIW I'm not making any conclusions about sb_alloc benefits outside reorderbuffer.c - it might easily be worth pursuing, no doubt about that. The amount of remaining work however seems quite high, though. Attached is the modified sb_alloc patch that I used - it's mostly v1 with removed uses in nbtree etc. FWIW the patch does not implement sb_destroy_private_allocator (it's only defined in the header), which seems like a bug. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services diff --git a/contrib/test_freepage/Makefile b/contrib/test_freepage/Makefile new file mode 100644 index 000..b482fe9 --- /dev/null +++ b/contrib/test_freepage/Makefile @@ -0,0 +1,17 @@ +# contrib/test_freepage/Makefile + +MODULES = test_freepage + +EXTENSION = test_freepage +DATA = test_freepage--1.0.sql + +ifdef USE_PGXS +PG_CONFIG = pg_config +PGXS := $(shell $(PG_CONFIG) --pgxs) +include $(PGXS) +else +subdir = contrib/test_freepage +top_builddir = ../.. +include $(top_builddir)/src/Makefile.global +include $(top_srcdir)/contrib/contrib-global.mk +endif diff --git a/contrib/test_freepage/test_freepage--1.0.sql b/contrib/test_freepage/test_freepage--1.0.sql new file mode 100644 index 000..5d3191e --- /dev/null +++ b/contrib/test_freepage/test_freepage--1.0.sql @@ -0,0 +1,15 @@ +/* contrib/test_freepage/test_freepage--1.0.sql */ + +-- complain if script is sourced in psql, rather than via CREATE EXTENSION +\echo Use "CREATE EXTENSION test_freepage" to load this file. \quit + +CREATE FUNCTION init(size pg_catalog.int8) RETURNS pg_catalog.void + AS 'MODULE_PATHNAME' LANGUAGE C STRICT; +CREATE FUNCTION get(pages pg_catalog.int8) RETURNS pg_catalog.int8 + AS 'MODULE_PATHNAME' LANGUAGE C STRICT; +CREATE FUNCTION inquire_largest() RETURNS pg_catalog.int8 + AS 'MODULE_PATHNAME' LANGUAGE C STRICT; +CREATE FUNCTION put(first_page pg_catalog.int8, npages pg_catalog.int8) + RETURNS pg_catalog.void AS 'MODULE_PATHNAME' LANGUAGE C STRICT; +CREATE FUNCTION dump() RETURNS pg_catalog.text +AS 'MODULE_PATHNAME' LANGUAGE C STRICT; diff --git a/contrib/test_freepage/test_freepage.c b/contrib/test_freepage/test_freepage.c new file mode 100644 index 000..074cf56 --- /dev/null +++ b/contrib/test_freepage/test_freepage.c @@ -0,0 +1,113 @@ +/*-- + * + * test_freepage.c + * Test harness code for free page manager. + * + * Copyright (C) 2013, PostgreSQL Global Development Group + * + * IDENTIFICATION + * contrib/test_freepage/test_freepage.c + * + * - + */ + +#include "postgres.h" + +#include "fmgr.h" +#include "miscadmin.h" +#include "utils/builtins.h" +#include "utils/freepage.h" + +PG_MODULE_MAGIC; +PG_FUNCTION_INFO_V1(init); +PG_FUNCTION_INFO_V1(get); +PG_FUNCTION_INFO_V1(inquire_largest); +PG_FUNCTION_INFO_V1(put); +PG_FUNCTION_INFO_V1(dump); + +Datum init(PG_FUNCTION_ARGS); +Datum get(PG_FUNCTION_ARGS); +Datum inquire_largest(PG_FUNCTION_ARGS); +Datum put(PG_FUNCTION_ARGS); +Datum dump(PG_FUNCTION_ARGS); + +char *space; +FreePageManager *fpm; + +Datum +init(PG_FUNCTION_ARGS) +{ + int64 size = PG_GETARG_INT64(0); + Size
Re: [HACKERS] pg_basebackup stream xlog to tar
On Sun, Oct 23, 2016 at 10:30 PM, Magnus Haganderwrote: > On Mon, Oct 17, 2016 at 7:37 AM, Michael Paquier > wrote: >> > But independent of this patch, actually putting that test in for non-tar >> > mode would probably not be a bad idea -- if that breaks, it's likely >> > both >> > break, after all. >> >> Agreed (you were able to break only tar upthread with your patch). One >> way to do that elegantly would be to: >> 1) extend slurp_dir to return only files that have a matching pattern. >> That's not difficult to do: >> --- a/src/test/perl/TestLib.pm >> +++ b/src/test/perl/TestLib.pm >> @@ -184,10 +184,14 @@ sub generate_ascii_string >> >> sub slurp_dir >> { >> - my ($dir) = @_; >> + my ($dir, $match_pattern) = @_; >> opendir(my $dh, $dir) >> or die "could not opendir \"$dir\": $!"; >> my @direntries = readdir $dh; >> + if (defined($match_pattern)) >> + { >> + @direntries = grep($match_pattern, @direntries); >> + } >> closedir $dh; >> return @direntries; >> } >> Sorting them at the same time may be a good idea.. >> 2) Add an option to pg_xlogdump to be able to output its output to a >> file. That would be awkward to rely on grabbing the output data from a >> pipe... On Windows particularly. Thinking about it, would that >> actually be useful to others? That's not a complicated patch. > > I think both of those would be worthwhile. Just for the testability in > itself, but such a flag to pg_xlogdump would probably be useful in other > cases as well, beyond just the testing. Looking quickly at the code, it does not seem that complicated... I may just send patches tomorrow for all those things and be done with it, all that on its new dedicated thread. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_basebackup stream xlog to tar
On Mon, Oct 17, 2016 at 7:37 AM, Michael Paquierwrote: > On Sat, Oct 15, 2016 at 8:51 AM, Magnus Hagander > wrote: > > Fixed. > > Ok, I had a extra look on the patch: > + The transactionn log files are written to a separate file > +called pg_xlog.tar. > + > s/transactionn/transaction/, and the markup should be on its own > line. > > + if (dir_data->sync) > + { > + if (fsync_fname(tmppath, false, progname) != 0) > + { > + close(fd); > + return NULL; > + } > + if (fsync_parent_path(tmppath, progname) != 0) > + { > + close(fd); > + return NULL; > + } > + } > Nit: squashing both things together would simplify the code. > > + else if (method == CLOSE_UNLINK > + ) > Your finger slipped here. > > Except that it looks in pretty good to me, so I am switching that as > ready for committer. > I incorporated those changes before pushing. > > But independent of this patch, actually putting that test in for non-tar > > mode would probably not be a bad idea -- if that breaks, it's likely both > > break, after all. > > Agreed (you were able to break only tar upthread with your patch). One > way to do that elegantly would be to: > 1) extend slurp_dir to return only files that have a matching pattern. > That's not difficult to do: > --- a/src/test/perl/TestLib.pm > +++ b/src/test/perl/TestLib.pm > @@ -184,10 +184,14 @@ sub generate_ascii_string > > sub slurp_dir > { > - my ($dir) = @_; > + my ($dir, $match_pattern) = @_; > opendir(my $dh, $dir) > or die "could not opendir \"$dir\": $!"; > my @direntries = readdir $dh; > + if (defined($match_pattern)) > + { > + @direntries = grep($match_pattern, @direntries); > + } > closedir $dh; > return @direntries; > } > Sorting them at the same time may be a good idea.. > 2) Add an option to pg_xlogdump to be able to output its output to a > file. That would be awkward to rely on grabbing the output data from a > pipe... On Windows particularly. Thinking about it, would that > actually be useful to others? That's not a complicated patch. > I think both of those would be worthwhile. Just for the testability in itself, but such a flag to pg_xlogdump would probably be useful in other cases as well, beyond just the testing. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] pg_basebackup stream xlog to tar
On Fri, Oct 21, 2016 at 2:02 PM, Michael Paquierwrote: > On Mon, Oct 17, 2016 at 2:37 PM, Michael Paquier > wrote: > > Except that it looks in pretty good to me, so I am switching that as > > ready for committer. > > + /* > +* Create pg_xlog/archive_status (and thus pg_xlog) so we can > write to > +* basedir/pg_xlog as the directory entry in the tar file may > arrive > +* later. > +*/ > + snprintf(statusdir, sizeof(statusdir), "%s/pg_xlog/archive_status", > +basedir); > > This part conflicts with f82ec32, where you need make pg_basebackup > aware of the backend version.. I promise that's the last conflict, at > least I don't have more patches planned in the area. > It also broke the tests and invalidated some documentation. But it was easy enough to fix. I've now applied this, so next time you get to do the merging :P Joking aside, please review and let me know if you can spot something I messed up in the final merge. Thanks for your repeated reviews! -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Renaming of pg_xlog and pg_clog
On Sun, Oct 23, 2016 at 5:18 PM, Vik Fearingwrote: > On 10/22/2016 06:00 PM, David Steele wrote: >> On 10/22/16 6:58 PM, Bruce Momjian wrote: >>> On Sat, Oct 22, 2016 at 07:33:56AM +0900, Michael Paquier wrote: On Sat, Oct 22, 2016 at 4:29 AM, Alvaro Herrera > Also +1 to renaming pg_subtrans to pg_subxact. Nice suggestion, good naming for consistency with the rest. >>> >>> Agreed. >> >> +1 > > +1 from me, too. OK, glad to see that we are reaching a conclusion to this thread. Attached are two patches. 0001 renames pg_clog to pg_xact. I found a comment in xact.c that was not yet updated. And 0002 renames pg_subtrans to pg_subxact. -- Michael From 1711e73fe59b37c0626c1382c6f044b28c01f3d9 Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Sun, 23 Oct 2016 20:52:22 +0900 Subject: [PATCH 1/2] Rename pg_clog to pg_xact pg_upgrade is updated to handle the transfer correctly to post-10 clusters. --- doc/src/sgml/backup.sgml | 4 ++-- doc/src/sgml/catalogs.sgml | 4 ++-- doc/src/sgml/config.sgml | 2 +- doc/src/sgml/func.sgml | 2 +- doc/src/sgml/maintenance.sgml | 8 doc/src/sgml/ref/pg_resetxlog.sgml | 4 ++-- doc/src/sgml/ref/pg_rewind.sgml| 2 +- doc/src/sgml/storage.sgml | 10 +- doc/src/sgml/wal.sgml | 2 +- src/backend/access/heap/heapam.c | 4 ++-- src/backend/access/transam/README | 10 +- src/backend/access/transam/clog.c | 2 +- src/backend/access/transam/commit_ts.c | 2 +- src/backend/access/transam/multixact.c | 2 +- src/backend/access/transam/subtrans.c | 5 ++--- src/backend/access/transam/transam.c | 2 +- src/backend/access/transam/twophase.c | 4 ++-- src/backend/access/transam/xact.c | 18 +- src/backend/access/transam/xlog.c | 2 +- src/backend/commands/vacuum.c | 10 +- src/backend/postmaster/autovacuum.c| 2 +- src/backend/storage/buffer/README | 2 +- src/backend/storage/ipc/procarray.c| 4 ++-- src/backend/utils/time/tqual.c | 6 +++--- src/bin/initdb/initdb.c| 2 +- src/bin/pg_upgrade/exec.c | 6 ++ src/bin/pg_upgrade/pg_upgrade.c| 30 ++ src/include/access/slru.h | 4 ++-- 28 files changed, 83 insertions(+), 72 deletions(-) diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml index 6eaed1e..d22af0d 100644 --- a/doc/src/sgml/backup.sgml +++ b/doc/src/sgml/backup.sgml @@ -382,10 +382,10 @@ tar -cf backup.tar /usr/local/pgsql/data directories. This will not work because the information contained in these files is not usable without the commit log files, - pg_clog/*, which contain the commit status of + pg_xact/*, which contain the commit status of all transactions. A table file is only usable with this information. Of course it is also impossible to restore only a - table and the associated pg_clog data + table and the associated pg_xact data because that would render all other tables in the database cluster useless. So file system backups only work for complete backup and restoration of an entire database cluster. diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index 29738b0..839b52a 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -1847,7 +1847,7 @@ All transaction IDs before this one have been replaced with a permanent (frozen) transaction ID in this table. This is used to track whether the table needs to be vacuumed in order to prevent transaction - ID wraparound or to allow pg_clog to be shrunk. Zero + ID wraparound or to allow pg_xact to be shrunk. Zero (InvalidTransactionId) if the relation is not a table. @@ -2514,7 +2514,7 @@ All transaction IDs before this one have been replaced with a permanent (frozen) transaction ID in this database. This is used to track whether the database needs to be vacuumed in order to prevent - transaction ID wraparound or to allow pg_clog to be shrunk. + transaction ID wraparound or to allow pg_xact to be shrunk. It is the minimum of the per-table pg_class.relfrozenxid values. diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index adab2f8..8a5d202 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -5834,7 +5834,7 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; Vacuum also allows removal of old files from the -pg_clog subdirectory, which is why the default +pg_xact subdirectory, which is why the default is a relatively low 200 million transactions. This parameter can only be set at server start, but the
Re: [HACKERS] pg_xlog error on the master
On Sun, Oct 23, 2016 at 5:05 PM, Venkata B Nagothiwrote: > I just did did a "git pull" to test one of my patches and i get the > following error : > > 2016-10-23 18:51:47.679 AEDT [31930] FATAL: could not open archive status > directory "pg_xlog/archive_status": No such file or directory > 2016-10-23 18:51:47.679 AEDT [31841] LOG: archiver process (PID 31930) > exited with exit code 1 > > is it expected ? No. > I notice that pg_xlog's name has been changed to pg_wal. I am not sure about > this. WAL archiving works correctly here, and tests in src/test/recovery/ work. Are you sure that you cleaned up up your source tree before recompiling? -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming of pg_xlog and pg_clog
On 10/22/2016 06:00 PM, David Steele wrote: > On 10/22/16 6:58 PM, Bruce Momjian wrote: >> On Sat, Oct 22, 2016 at 07:33:56AM +0900, Michael Paquier wrote: >>> On Sat, Oct 22, 2016 at 4:29 AM, Alvaro Herrera >>> Also +1 to renaming pg_subtrans to pg_subxact. >>> >>> Nice suggestion, good naming for consistency with the rest. >> >> Agreed. > > +1 +1 from me, too. -- Vik Fearing +33 6 46 75 15 36 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pg_xlog error on the master
I just did did a "git pull" to test one of my patches and i get the following error : 2016-10-23 18:51:47.679 AEDT [31930] FATAL: could not open archive status directory "pg_xlog/archive_status": No such file or directory 2016-10-23 18:51:47.679 AEDT [31841] LOG: archiver process (PID 31930) exited with exit code 1 is it expected ? I notice that pg_xlog's name has been changed to pg_wal. I am not sure about this. Regards, Venkata B N Database Consultant / Architect