Re: [HACKERS] Surprising behaviour of \set AUTOCOMMIT ON
On Thu, Aug 11, 2016 at 2:58 PM, Venkata Balaji N <nag1...@gmail.com> wrote:
>> On Tue, Aug 9, 2016 at 1:02 AM, Robert Haas <robertmh...@gmail.com> wrote:
>>> On Mon, Aug 8, 2016 at 10:10 AM, Rahila Syed <rahilasye...@gmail.com> wrote:
>>>> Thank you for inputs everyone.
>>>>
>>>> The opinions on this thread can be classified into following
>>>> 1. Commit
>>
>> This makes more sense, as the user who is doing it would realise that the
>> transaction has been left open.
>>
>>> Alternatively, I also think it would be sensible to issue an immediate
>>> COMMIT when the autocommit setting is changed from off to on. That
>>> was my first reaction.
>>
>> Issuing commit would indicate that open transactions will be committed,
>> which is not a good idea in my opinion. If the user is issuing AUTOCOMMIT =
>> ON, then it means all the transactions initiated after issuing this must be
>> committed, whereas it is committing the previously pending transactions as
>> well.

My apologies for the confusing statement. Correction - I meant that committing all the previously open transactions (transactions opened while autocommit=off) upon setting autocommit=on may not be a good idea. What the user means by autocommit=on is that all the subsequent transactions must be committed.

Regards,
Venkata B N
Fujitsu Australia
Re: [HACKERS] Surprising behaviour of \set AUTOCOMMIT ON
On Tue, Aug 9, 2016 at 1:02 AM, Robert Haas wrote:
>> On Mon, Aug 8, 2016 at 10:10 AM, Rahila Syed wrote:
>>> Thank you for inputs everyone.
>>>
>>> The opinions on this thread can be classified into following
>>> 1. Commit
>>> 2. Rollback
>>> 3. Error
>>> 4. Warning
>>>
>>> As per opinion upthread, issuing an implicit commit immediately after
>>> switching autocommit to ON can be unsafe if it was not desired. While I
>>> agree that it is difficult to judge the user's intention here, if we were
>>> to base it on some assumption, the closest would be an implicit COMMIT in
>>> my opinion. There is a higher likelihood of a user being happy with
>>> issuing a commit when setting autocommit ON than with a transaction being
>>> rolled back. Also, there are quite a few interfaces which provide this.
>>>
>>> As mentioned upthread, issuing a warning on switching back to autocommit
>>> will not be effective inside a script. It won't allow subsequent commands
>>> to be committed, as the set of autocommit to ON is not committed. Scripts
>>> will have to be rerun with changes, which will impact user friendliness.
>>>
>>> While I agree that issuing an ERROR and rolling back the transaction
>>> ranks higher in safe behaviour, it is not as common (according to
>>> instances stated upthread) as immediately committing any open transaction
>>> when switching back to autocommit.
>>
>> I think I like the option of having psql issue an error. On the server
>> side, the transaction would still be open, but the user would receive a
>> psql error message and the autocommit setting would not be changed. So the
>> user could type COMMIT or ROLLBACK manually and then retry changing the
>> value of the setting.

This makes more sense, as the user who is doing it would realise that the transaction has been left open.

>> Alternatively, I also think it would be sensible to issue an immediate
>> COMMIT when the autocommit setting is changed from off to on. That was my
>> first reaction.
Issuing commit would indicate that open transactions will be committed, which is not a good idea in my opinion. If the user is issuing AUTOCOMMIT = ON, then it means all the transactions initiated after issuing this must be committed, whereas it is committing the previously pending transactions as well.

>> Aborting the server-side transaction - with or without notice - doesn't
>> seem very reasonable.

Agreed. Traditionally, open transactions in the database must be left open until the user issues a COMMIT or ROLLBACK. If the session is changed or killed, then the transaction must be rolled back.

Regards,
Venkata B N
Fujitsu Australia
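The error-on-open-transaction proposal discussed above can be sketched as a toy client-side state machine. This is purely illustrative (a Python model of the proposed psql behaviour, not psql's actual implementation; the class and method names are my own):

```python
class Session:
    """Toy model of a psql session illustrating the proposal that
    \\set AUTOCOMMIT ON should fail while a transaction is open."""

    def __init__(self):
        self.autocommit = False
        self.in_transaction = False

    def execute(self, sql):
        # With autocommit off, any statement implicitly opens a transaction.
        if not self.autocommit:
            self.in_transaction = True

    def commit(self):
        # An explicit COMMIT closes the open transaction.
        self.in_transaction = False

    def set_autocommit(self, value):
        # Proposed behaviour: refuse to flip the setting mid-transaction,
        # leaving the setting unchanged so the user can COMMIT or ROLLBACK
        # manually and retry.
        if value and self.in_transaction:
            raise RuntimeError("a transaction is open; COMMIT or ROLLBACK "
                               "before setting AUTOCOMMIT ON")
        self.autocommit = value
```

The key property is that a failed `set_autocommit(True)` leaves both the setting and the server-side transaction untouched, which is what makes this variant the safest of the four options listed.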
Re: [HACKERS] recovery_target_time ignored ?
On Wed, Mar 25, 2015 at 1:28 AM, David Steele <da...@pgmasters.net> wrote:

On 3/24/15 6:12 AM, Venkata Balaji N wrote:

On Tue, Mar 24, 2015 at 9:54 AM, David Steele <da...@pgmasters.net> wrote:

On 3/23/15 12:42 AM, Venkata Balaji N wrote:

Hi,

Assuming that this might require a patch, I am posting this in pgsql-hackers. Apologies if this is not the appropriate mailing list to start this discussion.

I performed a PITR, and the below message in the log file is a bit confusing:

2015-03-23 13:49:09.816 GMT-10 DB= PID=4707 LOG: database system was interrupted; last known up at 2015-03-23 10:30:26 GMT-10
2015-03-23 13:49:09.817 GMT-10 DB= PID=4707 LOG: starting point-in-time recovery to 2015-03-23 10:00:26+10
2015-03-23 13:49:09.827 GMT-10 DB= PID=4707 LOG: restored log file 0001000B0020 from archive
2015-03-23 13:49:09.888 GMT-10 DB= PID=4707 LOG: redo starts at B/2090
2015-03-23 13:49:09.937 GMT-10 DB= PID=4707 LOG: consistent recovery state reached at B/20B8
2015-03-23 13:49:09.947 GMT-10 DB= PID=4707 LOG: restored log file 0001000B0021 from archive
2015-03-23 13:49:09.950 GMT-10 DB= PID=4707 LOG: recovery stopping before commit of transaction 16267, time 2015-03-23 13:22:37.53007+10

By mistake I gave recovery_target_time as 10:00 GMT, which is 25/30 minutes behind the backup start/end time registered in the backup_label. The parameter recovery_target_time is ignored, and recovery proceeds further, applying all the available WAL archive files and finally ending up bringing the database up.

I think it would make sense if the recovery did not proceed any further and instead errored out with a message like "recovery_target_time is behind the backup time; please consider using a backup taken prior to the recovery_target_time".

I just tried it with 9.3.5 and I do get an error:

LOG: starting point-in-time recovery to 2015-03-23 17:26:02.721307-04
LOG: restored log file 00010003 from archive
LOG: redo starts at 0/3C8
LOG: recovery stopping before commit of transaction 1001, time 2015-03-23 18:26:01.012593-04
LOG: redo done at 0/3000228
FATAL: requested recovery stop point is before consistent recovery point

That makes more sense. This is what I was expecting to happen. Then, I think it is something to do with the timestamp format.

Here's my recovery.conf file:

restore_command = '/usr/bin/pg_backrest.pl --stanza=db archive-get %f %p'
recovery_target_time = '2015-03-23 17:26:02.721307 EDT'

My recovery.conf file is as follows:

restore_command='cp /data/pgdata9400backup/pgwalarch9400backup/%f %p '
recovery_target_time='2015-03-23 10:00:26 GMT-10'
recovery_target_inclusive='true'

You have '2015-03-23 10:00:26 GMT-10' in recovery.conf but the log says 'starting point-in-time recovery to 2015-03-23 10:00:26+10'. Note the - vs +.

This is my confusion too. I picked up the time format from the backup label.

Could you check your log and recovery.conf and make sure the timezone offsets are actually different?

I am not sure why the timestamp is taken as 2015-03-23 10:00:26+10 for '2015-03-23 10:00:26 GMT-10'. My other system's timestamp format is also AEDT.

I did another test and I get the same problem. Below is the text from the log file. I gave a recovery_target_time almost a day behind the consistent recovery point.

Still, the recovery_target_time is taken as 2015-03-23 11:07:10+11 for 2015-03-23 11:07:10 AEDT.

2015-03-24 18:42:44.608 AEDT LOG: database system was interrupted; last known up at 2015-03-24 18:20:53 AEDT
2015-03-24 18:42:44.608 AEDT LOG: starting point-in-time recovery to 2015-03-23 11:07:10+11
cp: /disk3/pgwalarch9401/0001000300FE: No such file or directory
2015-03-24 18:42:44.626 AEDT LOG: record with zero length at 3/FE90

Below is my recovery.conf file:

restore_command='cp /disk3/pgwalarch9401/%f %p'
recovery_target_time='2015-03-23 11:07:10 AEDT'
recovery_target_inclusive=true

I am checking if this has something to do with my system timestamp format. Not sure what I am missing. Do I need to give any special time format?

This is a weird one. I can reproduce the problem by setting timezone='Australia/Sydney'. I also tried setting log_timezone='Australia/Sydney' and my system clock to 'Australia/Sydney', but I still saw the same issue. I'm testing this using unit tests for some backup software I'm
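A likely explanation for the "-" vs "+" flip, for what it's worth: in POSIX-style time zone specifications the sign is inverted relative to ISO offsets, so "GMT-10" denotes ten hours east of Greenwich, i.e. UTC+10. This can be illustrated with the IANA "Etc" zones, which follow the same POSIX convention (the use of Python's zoneinfo here is my own illustration, not part of the thread):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# IANA's Etc/GMT-10 follows the POSIX sign convention: it is UTC+10,
# ten hours EAST of Greenwich, despite the minus sign in the name.
dt = datetime(2015, 3, 23, 10, 0, 26, tzinfo=ZoneInfo("Etc/GMT-10"))
print(dt.isoformat())  # 2015-03-23T10:00:26+10:00
```

Under this convention, '2015-03-23 10:00:26 GMT-10' resolving to '2015-03-23 10:00:26+10' would be the documented POSIX behaviour rather than a parsing bug.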
Re: [HACKERS] pg_rewind in contrib
> I have committed this, with some more kibitzing. Hope I have not missed
> any comments given so far. Many thanks for the review, and please continue
> reviewing and testing it :-).

I have been testing pg_rewind and have an analysis to share, along with a few questions. I had a streaming replication setup with one master and one slave running successfully.

Test 1:

- Killed the postgres process on the master and promoted the slave. Both were in sync earlier.
- Performed some operations (data changes) on the newly promoted slave node and shut it down.
- Executed pg_rewind on the old master and got the below message:

target server must be shut down cleanly
Failure, exiting

If the master is crashed or killed abruptly, it may not be possible to do a rewind. Is my understanding correct?

Test 2:

- On a successfully running streaming replication setup with one master and one slave, I did a clean shutdown of the master.
- Promoted the slave.
- Performed some operations (data changes) on the newly promoted slave and did a clean shutdown.
- Executed pg_rewind on the old master to sync with the latest changes on the new master. I got the below message:

The servers diverged at WAL position 0/A298 on timeline 1.
No rewind required.

I am not understanding this one either.

Regards,
Venkata Balaji N
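The "No rewind required" outcome in Test 2 matches pg_rewind's high-level rule: a rewind is only needed if the old master (the rewind target) wrote WAL beyond the point where the timelines diverged. After a clean shutdown followed by promotion of the standby, the old master's history is still a prefix of the new master's, so there is nothing to undo. A sketch of that decision in miniature (my paraphrase in Python, not pg_rewind's actual code; LSNs shown as plain integers):

```python
def rewind_needed(target_end_of_wal_lsn: int, divergence_lsn: int) -> bool:
    """Paraphrase of pg_rewind's decision: the target only needs rewinding
    if it wrote WAL past the point of divergence; otherwise its data
    directory is still a valid ancestor of the new master's timeline."""
    return target_end_of_wal_lsn > divergence_lsn

# Test 2 of the report: the old master was shut down cleanly before the
# promotion, so its end-of-WAL coincides with the divergence point.
print(rewind_needed(0x0A298, 0x0A298))  # False -> "No rewind required"
```

In that case the old master can simply be restarted as a standby of the new master; the rewind step is a no-op.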
Re: [HACKERS] recovery_target_time ignored ?
On Tue, Mar 24, 2015 at 9:54 AM, David Steele <da...@pgmasters.net> wrote:

On 3/23/15 12:42 AM, Venkata Balaji N wrote:

Hi,

Assuming that this might require a patch, I am posting this in pgsql-hackers. Apologies if this is not the appropriate mailing list to start this discussion.

I performed a PITR, and the below message in the log file is a bit confusing:

2015-03-23 13:49:09.816 GMT-10 DB= PID=4707 LOG: database system was interrupted; last known up at 2015-03-23 10:30:26 GMT-10
2015-03-23 13:49:09.817 GMT-10 DB= PID=4707 LOG: starting point-in-time recovery to 2015-03-23 10:00:26+10
2015-03-23 13:49:09.827 GMT-10 DB= PID=4707 LOG: restored log file 0001000B0020 from archive
2015-03-23 13:49:09.888 GMT-10 DB= PID=4707 LOG: redo starts at B/2090
2015-03-23 13:49:09.937 GMT-10 DB= PID=4707 LOG: consistent recovery state reached at B/20B8
2015-03-23 13:49:09.947 GMT-10 DB= PID=4707 LOG: restored log file 0001000B0021 from archive
2015-03-23 13:49:09.950 GMT-10 DB= PID=4707 LOG: recovery stopping before commit of transaction 16267, time 2015-03-23 13:22:37.53007+10

By mistake I gave recovery_target_time as 10:00 GMT, which is 25/30 minutes behind the backup start/end time registered in the backup_label. The parameter recovery_target_time is ignored, and recovery proceeds further, applying all the available WAL archive files and finally ending up bringing the database up.

I think it would make sense if the recovery did not proceed any further and instead errored out with a message like "recovery_target_time is behind the backup time; please consider using a backup taken prior to the recovery_target_time".

I just tried it with 9.3.5 and I do get an error:

LOG: starting point-in-time recovery to 2015-03-23 17:26:02.721307-04
LOG: restored log file 00010003 from archive
LOG: redo starts at 0/3C8
LOG: recovery stopping before commit of transaction 1001, time 2015-03-23 18:26:01.012593-04
LOG: redo done at 0/3000228
FATAL: requested recovery stop point is before consistent recovery point

That makes more sense. This is what I was expecting to happen. Then, I think it is something to do with the timestamp format.

Here's my recovery.conf file:

restore_command = '/usr/bin/pg_backrest.pl --stanza=db archive-get %f %p'
recovery_target_time = '2015-03-23 17:26:02.721307 EDT'

My recovery.conf file is as follows:

restore_command='cp /data/pgdata9400backup/pgwalarch9400backup/%f %p '
recovery_target_time='2015-03-23 10:00:26 GMT-10'
recovery_target_inclusive='true'

You have '2015-03-23 10:00:26 GMT-10' in recovery.conf but the log says 'starting point-in-time recovery to 2015-03-23 10:00:26+10'. Note the - vs +.

This is my confusion too. I picked up the time format from the backup label.

Could you check your log and recovery.conf and make sure the timezone offsets are actually different?

I am not sure why the timestamp is taken as 2015-03-23 10:00:26+10 for '2015-03-23 10:00:26 GMT-10'. My other system's timestamp format is also AEDT.

I did another test and I get the same problem. Below is the text from the log file. I gave a recovery_target_time almost a day behind the consistent recovery point.

Still, the recovery_target_time is taken as 2015-03-23 11:07:10+11 for 2015-03-23 11:07:10 AEDT.

2015-03-24 18:42:44.608 AEDT LOG: database system was interrupted; last known up at 2015-03-24 18:20:53 AEDT
2015-03-24 18:42:44.608 AEDT LOG: starting point-in-time recovery to 2015-03-23 11:07:10+11
cp: /disk3/pgwalarch9401/0001000300FE: No such file or directory
2015-03-24 18:42:44.626 AEDT LOG: record with zero length at 3/FE90

Below is my recovery.conf file:

restore_command='cp /disk3/pgwalarch9401/%f %p'
recovery_target_time='2015-03-23 11:07:10 AEDT'
recovery_target_inclusive=true

I am checking if this has something to do with my system timestamp format. Not sure what I am missing. Do I need to give any special time format?

Regards,
Venkata Balaji N
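As a side note (my addition, not from the thread): the +11 rendering for AEDT is itself correct. AEDT is UTC+11, and Sydney was still on daylight saving on 2015-03-23 (DST that season ended on 2015-04-05), which a quick check with Python's zoneinfo confirms:

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

# 2015-03-23 falls inside the 2014-15 daylight saving period for Sydney,
# so local time there is AEDT = UTC+11.
dt = datetime(2015, 3, 23, 11, 7, 10, tzinfo=ZoneInfo("Australia/Sydney"))
print(dt.utcoffset())  # 11:00:00
```

So '2015-03-23 11:07:10 AEDT' and '2015-03-23 11:07:10+11' name the same instant; the timestamp conversion itself is not where the recovery problem lies.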
[HACKERS] recovery_target_time ignored ?
Hi,

Assuming that this might require a patch, I am posting this in pgsql-hackers. Apologies if this is not the appropriate mailing list to start this discussion.

I performed a PITR, and the below message in the log file is a bit confusing:

2015-03-23 13:49:09.816 GMT-10 DB= PID=4707 LOG: database system was interrupted; last known up at 2015-03-23 10:30:26 GMT-10
2015-03-23 13:49:09.817 GMT-10 DB= PID=4707 LOG: starting point-in-time recovery to 2015-03-23 10:00:26+10
2015-03-23 13:49:09.827 GMT-10 DB= PID=4707 LOG: restored log file 0001000B0020 from archive
2015-03-23 13:49:09.888 GMT-10 DB= PID=4707 LOG: redo starts at B/2090
2015-03-23 13:49:09.937 GMT-10 DB= PID=4707 LOG: consistent recovery state reached at B/20B8
2015-03-23 13:49:09.947 GMT-10 DB= PID=4707 LOG: restored log file 0001000B0021 from archive
2015-03-23 13:49:09.950 GMT-10 DB= PID=4707 LOG: recovery stopping before commit of transaction 16267, time 2015-03-23 13:22:37.53007+10

By mistake I gave recovery_target_time as 10:00 GMT, which is 25/30 minutes behind the backup start/end time registered in the backup_label. The parameter recovery_target_time is ignored, and recovery proceeds further, applying all the available WAL archive files and finally ending up bringing the database up.

I think it would make sense if the recovery did not proceed any further and instead errored out with a message like "recovery_target_time is behind the backup time; please consider using a backup taken prior to the recovery_target_time".

recovery.conf file is as follows:

restore_command='cp /data/pgdata9400backup/pgwalarch9400backup/%f %p '
recovery_target_time='2015-03-23 10:00:26 GMT-10'
recovery_target_inclusive='true'

If this requires a patch, I would like to take it up.

Regards,
Venkata Balaji N
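The proposed check is simple in outline: compare the requested target against the time recorded for the backup and refuse to start recovery when the target predates it. A hypothetical sketch (names, message text, and the comparison rule are my illustration of the proposal, not PostgreSQL code):

```python
from datetime import datetime, timedelta, timezone

def check_recovery_target(backup_label_time: datetime,
                          recovery_target_time: datetime) -> None:
    """Sketch of the proposed validation: error out when the requested
    recovery target lies before the time the backup was last known up."""
    if recovery_target_time < backup_label_time:
        raise ValueError(
            "recovery_target_time is behind the backup time; "
            "please use a backup taken prior to the recovery_target_time")

# The scenario from the log above: backup last known up at 10:30:26,
# target mistakenly set 30 minutes earlier (both at UTC+10).
tz = timezone(timedelta(hours=10))
backup_time = datetime(2015, 3, 23, 10, 30, 26, tzinfo=tz)
target_time = datetime(2015, 3, 23, 10, 0, 26, tzinfo=tz)
```

With these values, `check_recovery_target(backup_time, target_time)` would fail up front instead of letting recovery run past the target and bring the database up anyway.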
Re: [HACKERS] Streaming replication and WAL archive interactions
> Here's a first cut at this. It includes the changes from your
> standby_wal_archiving_v1.patch, so you get that behaviour if you set
> archive_mode='always', and the new behaviour I wanted with
> archive_mode='shared'. I wrote it on top of the other patch I posted
> recently to not archive bogus recycled WAL segments after promotion (
> http://www.postgresql.org/message-id/549489fa.4010...@vmware.com), but it
> seems to apply without it too. I suggest reading the documentation changes
> first; it hopefully explains pretty well how to use this. The code should
> work too, and comments on that are welcome too, but I haven't tested it
> much. I'll do more testing next week.

The patch did not get applied successfully to the latest master. Can you please rebase?

Regards,
Venkata Balaji N
Re: [HACKERS] Redesigning checkpoint_segments
> I am wondering a bit about interaction with wal_keep_segments. One thing
> is that wal_keep_segments is still specified in number of segments and not
> size units; maybe it would be worth changing that also? And the other
> thing is that, if set, wal_keep_segments is the real max_wal_size from the
> user's perspective (not from the perspective of the algorithm in this
> patch, but the user does not really care about that), which is somewhat
> weird given the naming.

In my opinion, wal_keep_segments being a number of segments helps a lot. In my experience handling production databases, to arrive at an optimal value for wal_keep_segments we calculate the number of segments generated in the WAL archive destination (on an hourly or daily basis); this further helps us calculate how many segments to keep, considering various other factors in a replication environment, to ensure the master has enough WALs in pg_xlog when the standby comes back up after an outage. Of course, if we can calculate the number of segments, we can calculate the same in terms of size too, but calculating the number of segments is more feasible.

Regards,
VBN
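The sizing procedure described above - measure the segment generation rate, then keep enough segments to cover the longest expected standby outage - can be written down as a back-of-the-envelope helper. Everything here is an illustrative assumption (the function name, the 1.5x safety factor, and the example rate are mine; only the 16 MB default segment size comes from PostgreSQL):

```python
import math

WAL_SEGMENT_MB = 16  # PostgreSQL's default WAL segment size

def suggested_keep_segments(segments_per_hour: float,
                            max_outage_hours: float,
                            safety_factor: float = 1.5) -> int:
    """Back-of-the-envelope wal_keep_segments estimate: enough segments to
    cover the worst expected standby outage, padded by a safety factor."""
    return math.ceil(segments_per_hour * max_outage_hours * safety_factor)

# e.g. ~120 segments/hour observed in the archive destination, and a
# worst-case standby outage of 4 hours:
keep = suggested_keep_segments(120, 4)
print(keep)                      # segments to retain in pg_xlog
print(keep * WAL_SEGMENT_MB)     # approximate disk cost in MB
```

This also illustrates the point being made: once you have the rate in segments, converting to a size budget is a single multiplication, so either unit is workable, but the measurement naturally starts in segments.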
Re: [HACKERS] Redesigning checkpoint_segments
I have looked at the code and do not have any comments from my end.

Regards,
Venkata Balaji N
Re: [HACKERS] Redesigning checkpoint_segments
Missed adding the pgsql-hackers group while replying.

Regards,
Venkata Balaji N

On Thu, Feb 5, 2015 at 12:48 PM, Venkata Balaji N <nag1...@gmail.com> wrote:

On Fri, Jan 30, 2015 at 7:58 PM, Heikki Linnakangas <hlinnakan...@vmware.com> wrote:

On 01/30/2015 04:48 AM, Venkata Balaji N wrote:

I performed a series of tests for this patch and would like to share the results. My comments are in-line.

Thanks for the testing!

Test 1:

In this test, I see removed+recycled segments = 3 (except for the first 3 checkpoint cycles), and this has been steady throughout until the INSERT operation completed. The actual calculation of CheckPointSegments = 3.2 (which is getting rounded down to 3). pg_xlog size is 128M and increased to 160M at most during the INSERT operation.

shared_buffers = 128M
checkpoint_wal_size = 128M
min_recycle_wal_size = 80M
checkpoint_timeout = 5min

Hmm, did I understand correctly that pg_xlog peaked at 160MB, but mostly stayed at 128 MB? That sounds like it's working as designed; checkpoint_wal_size is not a hard limit after all.

Yes, the pg_xlog directory size peaked at 160MB at times and stayed at 128MB most of the time. I made another observation in a later round of testing; my observations are below.

b) Are the two GUCs, checkpoint_wal_size and min_recycle_wal_size, intuitive to set?

During my tests, I did not observe the significance of the min_recycle_wal_size parameter yet. Of course, I had sufficient disk space for pg_xlog. I would like to understand more about the min_recycle_wal_size parameter. In theory, I only understand from the note in the patch that if the disk space usage falls below a certain threshold, min_recycle_wal_size number of WALs will be removed to accommodate future pg_xlog segments. I will try to test this out. Please let me know if there is any specific test to understand min_recycle_wal_size behaviour.
min_recycle_wal_size comes into play when you have only light load, so that checkpoints are triggered by checkpoint_timeout rather than checkpoint_wal_size. In that scenario, WAL usage will shrink down to min_recycle_wal_size, but not below that. Did that explanation help? Can you suggest changes to the docs to make it more clear?

Thanks for the explanation. I see the below note in the patch; I think it should also say that the minimum WAL size on disk will be min_recycle_wal_size during light-load and idle situations. I think the parameter name min_recycle_wal_size implies something slightly different; it does not give the impression that it is the minimum WAL size on disk during light loads. I agree with Josh Berkus that the parameter (min_recycle_wal_size) name should be something like min_wal_size, which makes more sense.

+ <varname>wal_recycle_min_size</varname> puts a minimum on the amount of WAL files
+ recycled for future usage; that much WAL is always recycled for future use,
+ even if the system is idle and the WAL usage estimate suggests that little
+ WAL is needed.
+ </para>

Note: in wal.sgml, the parameter name is mentioned as wal_recycle_min_size. That must be changed to min_recycle_wal_size.

Another round of testing: I raised checkpoint_wal_size to 10512 MB, which is about 10GB, and kept min_recycle_wal_size at 128 MB (with checkpoint_timeout = 5min). The checkpoints timed out and started recycling about 2 GB of segments regularly; below are the checkpoint logs. I started loading data of size more than 100GB.
TimeStamp=2015-02-05 10:22:40.323 GMT-10 DB= SID=54d2af22.65b4 User= LOG: checkpoint complete: wrote 83998 buffers (64.1%); 0 transaction log file(s) added, 0 removed, 135 recycled; write=95.687 s, sync=25.845 s, total=121.866 s; sync files=18, longest=10.306 s, average=1.435 s; distance=2271524 KB, estimate=2300497 KB
TimeStamp=2015-02-05 10:25:38.875 GMT-10 DB= SID=54d2af22.65b4 User= LOG: checkpoint starting: time
TimeStamp=2015-02-05 10:27:50.955 GMT-10 DB= SID=54d2af22.65b4 User= LOG: checkpoint complete: wrote 83216 buffers (63.5%); 0 transaction log file(s) added, 0 removed, 146 recycled; write=96.951 s, sync=34.814 s, total=132.079 s; sync files=18, longest=9.535 s, average=1.934 s; distance=2229416 KB, estimate=2293388 KB
TimeStamp=2015-02-05 10:30:38.786 GMT-10 DB= SID=54d2af22.65b4 User= LOG: checkpoint starting: time
TimeStamp=2015-02-05 10:32:20.332 GMT-10 DB= SID=54d2af22.65b4 User= LOG: checkpoint complete: wrote 82409 buffers (62.9%); 0 transaction log file(s) added, 0 removed, 131 recycled; write=94.712 s, sync=6.516 s, total=101.545 s; sync files=18, longest=2.645 s, average=0.362 s; distance=2131805 KB, estimate=2277230 KB
TimeStamp=2015-02-05 10:35:38.788 GMT-10 DB= SID=54d2af22.65b4 User= LOG: checkpoint starting: time
TimeStamp=2015-02-05 10:37:35.883 GMT-10 DB= SID=54d2af22.65b4 User= LOG: checkpoint complete: wrote 87821 buffers (67.0%); 0 transaction log file(s) added, 0 removed, 134 recycled; write=99.461
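The "distance" and "recycled" figures in these lines can be cross-checked against each other: distance is the WAL written between checkpoints in KB, so dividing by the 16 MB segment size gives the segment count a checkpoint cycle consumes. A small helper (my addition, purely for sanity-checking the numbers quoted above):

```python
import re

# First checkpoint-complete line from the log above, trimmed to the
# fields the helper needs.
LOG_LINE = ("checkpoint complete: wrote 83998 buffers (64.1%); "
            "0 transaction log file(s) added, 0 removed, 135 recycled; "
            "distance=2271524 KB, estimate=2300497 KB")

def wal_distance_segments(line: str, segment_kb: int = 16 * 1024) -> int:
    """Convert the distance= field of a checkpoint-complete log line into
    whole 16 MB WAL segments, for comparison with the recycled count."""
    m = re.search(r"distance=(\d+) KB", line)
    if m is None:
        raise ValueError("no distance field found in log line")
    return int(m.group(1)) // segment_kb

print(wal_distance_segments(LOG_LINE))  # 138
```

distance=2271524 KB works out to about 138 segments, in the same ballpark as the 135 recycled in that cycle, which is consistent with the checkpoints recycling roughly one cycle's worth of WAL (about 2 GB) each time.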