Re: Replication slot stats misgivings
On Mon, May 24, 2021 at 10:09 AM vignesh C wrote:
>
> On Mon, May 24, 2021 at 9:38 AM Amit Kapila wrote:
> >
> > On Thu, May 13, 2021 at 11:30 AM vignesh C wrote:
> > >
> > Do we want to update the information about pg_stat_replication_slots
> > at the following place in docs
> > https://www.postgresql.org/docs/devel/logicaldecoding-catalogs.html?
> >
> > If so, feel free to submit the patch for it?
>
> Adding it will be useful, the attached patch has the changes for the same.
>

Thanks for the patch, pushed.

--
With Regards,
Amit Kapila.
Re: Replication slot stats misgivings
On Mon, May 24, 2021 at 9:38 AM Amit Kapila wrote:
>
> On Thu, May 13, 2021 at 11:30 AM vignesh C wrote:
> >
> Do we want to update the information about pg_stat_replication_slots
> at the following place in docs
> https://www.postgresql.org/docs/devel/logicaldecoding-catalogs.html?
>
> If so, feel free to submit the patch for it?

Adding it will be useful, the attached patch has the changes for the same.

Regards,
Vignesh

From 133856438c7e027c392e640e380a644063fd53f8 Mon Sep 17 00:00:00 2001
From: vignesh
Date: Mon, 24 May 2021 10:03:12 +0530
Subject: [PATCH v1] Added pg_stat_replication_slots view information in
 "System Catalogs Related to Logical Decoding" documentation.

pg_stat_replication_slots information was missing in "System Catalogs
Related to Logical Decoding" documentation. Fixed it by adding it.
---
 doc/src/sgml/logicaldecoding.sgml | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/doc/src/sgml/logicaldecoding.sgml b/doc/src/sgml/logicaldecoding.sgml
index a7ec5c3577..87a37a5587 100644
--- a/doc/src/sgml/logicaldecoding.sgml
+++ b/doc/src/sgml/logicaldecoding.sgml
@@ -401,6 +401,8 @@ postgres=# select * from pg_logical_slot_get_changes('regression_slot', NULL, NU
     view provide information about the current state of replication slots and
     streaming replication connections respectively. These views apply to both
     physical and logical replication.
+    The pg_stat_replication_slots
+    view provides statistics information about the logical replication slots.
--
2.25.1
Re: Replication slot stats misgivings
On Thu, May 13, 2021 at 11:30 AM vignesh C wrote:
>

Do we want to update the information about pg_stat_replication_slots
at the following place in docs
https://www.postgresql.org/docs/devel/logicaldecoding-catalogs.html?

If so, feel free to submit the patch for it?

--
With Regards,
Amit Kapila.
Re: Replication slot stats misgivings
On Thu, May 13, 2021 at 11:21 AM Amit Kapila wrote:
>
> On Wed, May 12, 2021 at 4:02 PM Masahiko Sawada wrote:
> >
> > On Wed, May 12, 2021 at 1:19 PM vignesh C wrote:
> > >
> > > > I think the part of the test that tests the stats after resetting it
> > > > might give different results. This can happen because in the previous
> > > > test we spill multiple times (spill_count is 12 in my testing) and it
> > > > is possible that some of the spill stats messages are received by the
> > > > stats collector after the reset message. If this theory is correct then
> > > > it is better that we remove the test for reset stats and the test after
> > > > it "decode and check stats again.". This is not directly related to your
> > > > patch or the buildfarm failure, but I guess this can happen and we might
> > > > see such a failure in future.
> >
> > Good point. I agree to remove this test.
> >
> > > I agree with your analysis to remove that test. Attached patch has the
> > > changes for the same.
> >
> > Thank you for the patch. The patch looks good to me. I also agree that
> > removing the test doesn't reduce the test coverage.
>
> Thanks, I have pushed the patch.

Thanks for pushing the patch.

Regards,
Vignesh
Re: Replication slot stats misgivings
On Wed, May 12, 2021 at 4:02 PM Masahiko Sawada wrote:
>
> On Wed, May 12, 2021 at 1:19 PM vignesh C wrote:
> >
> > > I think the part of the test that tests the stats after resetting it
> > > might give different results. This can happen because in the previous
> > > test we spill multiple times (spill_count is 12 in my testing) and it
> > > is possible that some of the spill stats messages are received by the
> > > stats collector after the reset message. If this theory is correct then
> > > it is better that we remove the test for reset stats and the test after
> > > it "decode and check stats again.". This is not directly related to your
> > > patch or the buildfarm failure, but I guess this can happen and we might
> > > see such a failure in future.
>
> Good point. I agree to remove this test.
>
> > I agree with your analysis to remove that test. Attached patch has the
> > changes for the same.
>
> Thank you for the patch. The patch looks good to me. I also agree that
> removing the test doesn't reduce the test coverage.

Thanks, I have pushed the patch.

--
With Regards,
Amit Kapila.
Re: Replication slot stats misgivings
On Wed, May 12, 2021 at 1:19 PM vignesh C wrote:
>
> On Wed, May 12, 2021 at 9:08 AM Amit Kapila wrote:
> >
> > On Wed, May 12, 2021 at 7:53 AM Amit Kapila wrote:
> > >
> > > On Wed, May 12, 2021 at 4:00 AM Masahiko Sawada wrote:
> > > >
> > > > Ugh, since by commit 592f00f8de we send slot stats after every
> > > > spill/stream, it's possible that we report slot stats that have
> > > > non-zero counters for spill_bytes/txns and zeroes for
> > > > total_bytes/txns. It seems to me it's legitimate that the slot stats
> > > > view shows non-zero values for spill_bytes/txns and zero values for
> > > > total_bytes/txns during decoding a large transaction. So I think we
> > > > can fix the test script so that it checks only spill_bytes/txns when
> > > > checking spilled transactions.
> > > >
> > >
> > > Your analysis and fix look correct to me.
> > >
> >
> > I think the part of the test that tests the stats after resetting it
> > might give different results. This can happen because in the previous
> > test we spill multiple times (spill_count is 12 in my testing) and it
> > is possible that some of the spill stats messages are received by the
> > stats collector after the reset message. If this theory is correct then
> > it is better that we remove the test for reset stats and the test after
> > it "decode and check stats again.". This is not directly related to your
> > patch or the buildfarm failure, but I guess this can happen and we might
> > see such a failure in future.

Good point. I agree to remove this test.

> I agree with your analysis to remove that test. Attached patch has the
> changes for the same.

Thank you for the patch. The patch looks good to me. I also agree that
removing the test doesn't reduce the test coverage.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Re: Replication slot stats misgivings
vignesh C writes:
> On Wed, May 12, 2021 at 9:59 AM Tom Lane wrote:
>> Is there any value in converting the test case into a TAP test that
>> could be more flexible about the expected output?  I'm mainly wondering
>> whether there are any code paths that this test forces the server through,
>> which would otherwise lack coverage.

> Removing this test does not reduce code coverage. This test basically
> decodes and checks the stats again; it is kind of a repetitive test.
> The problem with this test is that when a transaction is spilled, the
> statistics for the spilled transaction are sent to the statistics
> collector as and when the transaction is spilled. This test sends spill
> stats around 12 times. The test expects to reset the stats and check
> that the stats get updated when we get the changes. We cannot validate
> the reset slot stats results here, as it is possible that on some
> machines the stats collector receives the spilled transaction stats
> after the slot stats have been reset. This same problem would exist
> with TAP tests too. So I feel it is better to remove this test.

OK, I'm satisfied as long as we've considered the code-coverage angle.

			regards, tom lane
Re: Replication slot stats misgivings
On Wed, May 12, 2021 at 9:59 AM Tom Lane wrote:
>
> vignesh C writes:
> > I agree with your analysis to remove that test. Attached patch has the
> > changes for the same.
>
> Is there any value in converting the test case into a TAP test that
> could be more flexible about the expected output?  I'm mainly wondering
> whether there are any code paths that this test forces the server through,
> which would otherwise lack coverage.

Removing this test does not reduce code coverage. This test basically
decodes and checks the stats again; it is kind of a repetitive test.
The problem with this test is that when a transaction is spilled, the
statistics for the spilled transaction are sent to the statistics
collector as and when the transaction is spilled. This test sends spill
stats around 12 times. The test expects to reset the stats and check
that the stats get updated when we get the changes. We cannot validate
the reset slot stats results here, as it is possible that on some
machines the stats collector receives the spilled transaction stats
after the slot stats have been reset. This same problem would exist
with TAP tests too. So I feel it is better to remove this test.

Regards,
Vignesh
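[Editor's note] The race vignesh describes can be illustrated with a tiny model (a hypothetical simulation, not PostgreSQL code): the stats collector applies messages in arrival order, and because the pgstat transport is lossy and unordered, a spill-counter update sent before the reset request can arrive after it, leaving the "counters are zero after reset" check flaky.

```python
def collector_apply(messages):
    """Apply stats messages to a collector in (possibly reordered) arrival order."""
    stats = {"spill_count": 0}
    for kind, value in messages:
        if kind == "reset":
            stats["spill_count"] = 0      # reset message zeroes the counters
        else:                             # "spill" counter update message
            stats["spill_count"] += value
    return stats

# Messages as sent by the backend: twelve spill updates, then a reset.
sent = [("spill", 1)] * 12 + [("reset", None)]

# In-order delivery: the reset lands last, so the counters read zero.
assert collector_apply(sent)["spill_count"] == 0

# Reordered delivery: one late spill message arrives after the reset,
# so the post-reset check observes a non-zero counter and the test fails.
late = [("spill", 1)] * 11 + [("reset", None), ("spill", 1)]
assert collector_apply(late)["spill_count"] == 1
```

Since the reordering depends on machine timing, no fixed expected output exists for the real test, which is why removing the reset check (rather than adjusting it) was chosen.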
Re: Replication slot stats misgivings
vignesh C writes:
> I agree with your analysis to remove that test. Attached patch has the
> changes for the same.

Is there any value in converting the test case into a TAP test that
could be more flexible about the expected output?  I'm mainly wondering
whether there are any code paths that this test forces the server through,
which would otherwise lack coverage.

			regards, tom lane
Re: Replication slot stats misgivings
On Wed, May 12, 2021 at 9:08 AM Amit Kapila wrote:
>
> On Wed, May 12, 2021 at 7:53 AM Amit Kapila wrote:
> >
> > On Wed, May 12, 2021 at 4:00 AM Masahiko Sawada wrote:
> > >
> > > Ugh, since by commit 592f00f8de we send slot stats after every
> > > spill/stream, it's possible that we report slot stats that have
> > > non-zero counters for spill_bytes/txns and zeroes for
> > > total_bytes/txns. It seems to me it's legitimate that the slot stats
> > > view shows non-zero values for spill_bytes/txns and zero values for
> > > total_bytes/txns during decoding a large transaction. So I think we
> > > can fix the test script so that it checks only spill_bytes/txns when
> > > checking spilled transactions.
> > >
> >
> > Your analysis and fix look correct to me.
> >
>
> I think the part of the test that tests the stats after resetting it
> might give different results. This can happen because in the previous
> test we spill multiple times (spill_count is 12 in my testing) and it
> is possible that some of the spill stats messages are received by the
> stats collector after the reset message. If this theory is correct then
> it is better that we remove the test for reset stats and the test after
> it "decode and check stats again.". This is not directly related to your
> patch or the buildfarm failure, but I guess this can happen and we might
> see such a failure in future.

I agree with your analysis to remove that test. Attached patch has the
changes for the same.

Regards,
Vignesh

diff --git a/contrib/test_decoding/expected/stats.out b/contrib/test_decoding/expected/stats.out
index 7d174ee8d2..206c0a126e 100644
--- a/contrib/test_decoding/expected/stats.out
+++ b/contrib/test_decoding/expected/stats.out
@@ -111,48 +111,10 @@ SELECT wait_for_decode_stats(false, true);
 
 (1 row)
 
-SELECT slot_name, spill_txns > 0 AS spill_txns, spill_count > 0 AS spill_count, total_txns > 0 AS total_txns, total_bytes > 0 AS total_bytes FROM pg_stat_replication_slots;
-       slot_name       | spill_txns | spill_count | total_txns | total_bytes
------------------------+------------+-------------+------------+-------------
- regression_slot_stats | t          | t           | t          | t
-(1 row)
-
--- reset the slot stats, and wait for stats collector to reset
-SELECT pg_stat_reset_replication_slot('regression_slot_stats');
- pg_stat_reset_replication_slot
---------------------------------
- 
-(1 row)
-
-SELECT wait_for_decode_stats(true, true);
- wait_for_decode_stats
------------------------
- 
-(1 row)
-
-SELECT slot_name, spill_txns, spill_count, total_txns, total_bytes FROM pg_stat_replication_slots;
-       slot_name       | spill_txns | spill_count | total_txns | total_bytes
------------------------+------------+-------------+------------+-------------
- regression_slot_stats |          0 |           0 |          0 |           0
-(1 row)
-
--- decode and check stats again.
-SELECT count(*) FROM pg_logical_slot_peek_changes('regression_slot_stats', NULL, NULL, 'skip-empty-xacts', '1');
- count
-------
-  5002
-(1 row)
-
-SELECT wait_for_decode_stats(false, true);
- wait_for_decode_stats
------------------------
- 
-(1 row)
-
-SELECT slot_name, spill_txns > 0 AS spill_txns, spill_count > 0 AS spill_count, total_txns > 0 AS total_txns, total_bytes > 0 AS total_bytes FROM pg_stat_replication_slots;
-       slot_name       | spill_txns | spill_count | total_txns | total_bytes
------------------------+------------+-------------+------------+-------------
- regression_slot_stats | t          | t           | t          | t
+SELECT slot_name, spill_txns > 0 AS spill_txns, spill_count > 0 AS spill_count FROM pg_stat_replication_slots;
+       slot_name       | spill_txns | spill_count
+-----------------------+------------+-------------
+ regression_slot_stats | t          | t
 (1 row)
 
 -- Ensure stats can be repeatedly accessed using the same stats snapshot. See
diff --git a/contrib/test_decoding/sql/stats.sql b/contrib/test_decoding/sql/stats.sql
index 263568b00c..67462ca27f 100644
--- a/contrib/test_decoding/sql/stats.sql
+++ b/contrib/test_decoding/sql/stats.sql
@@ -73,17 +73,7 @@ SELECT count(*) FROM pg_logical_slot_peek_changes('regression_slot_stats', NULL,
 -- exact stats count as that can vary if any background transaction (say by
 -- autovacuum) happens in parallel to the main transaction.
 SELECT wait_for_decode_stats(false, true);
-SELECT slot_name, spill_txns > 0 AS spill_txns, spill_count > 0 AS spill_count, total_txns > 0 AS total_txns, total_bytes > 0 AS total_bytes FROM pg_stat_replication_slots;
-
--- reset the slot stats, and wait for stats collector to reset
-SELECT pg_stat_reset_replication_slot('regression_slot_stats');
-SELECT wait_for_decode_stats(true, true);
-SELECT slot_name, spill_txns, spill_count, total_txns, total_bytes FROM pg_stat_replication_slots;
-
--- decode and check stats again.
-SELECT count(*) FROM pg_logical_slot_peek_changes('regression_slot_stats', NULL, NULL, 'skip-empty-xacts',
Re: Replication slot stats misgivings
On Wed, May 12, 2021 at 7:53 AM Amit Kapila wrote:
>
> On Wed, May 12, 2021 at 4:00 AM Masahiko Sawada wrote:
> >
> > Ugh, since by commit 592f00f8de we send slot stats after every
> > spill/stream, it's possible that we report slot stats that have
> > non-zero counters for spill_bytes/txns and zeroes for total_bytes/txns.
> > It seems to me it's legitimate that the slot stats view shows non-zero
> > values for spill_bytes/txns and zero values for total_bytes/txns
> > during decoding a large transaction. So I think we can fix the test
> > script so that it checks only spill_bytes/txns when checking spilled
> > transactions.
> >
>
> Your analysis and fix look correct to me.
>

I think the part of the test that tests the stats after resetting it
might give different results. This can happen because in the previous
test we spill multiple times (spill_count is 12 in my testing) and it
is possible that some of the spill stats messages are received by the
stats collector after the reset message. If this theory is correct then
it is better that we remove the test for reset stats and the test after
it "decode and check stats again.". This is not directly related to your
patch or the buildfarm failure, but I guess this can happen and we might
see such a failure in future.

--
With Regards,
Amit Kapila.
Re: Replication slot stats misgivings
On Wed, May 12, 2021 at 4:00 AM Masahiko Sawada wrote:
>
> On Wed, May 12, 2021 at 6:32 AM Tom Lane wrote:
> >
> > Amit Kapila writes:
> > > I have closed this open item.
> >
> > That seems a little premature, considering that the
> > contrib/test_decoding/sql/stats.sql test case is still failing regularly.
>
> Thank you for reporting.
>
> Ugh, since by commit 592f00f8de we send slot stats after every
> spill/stream, it's possible that we report slot stats that have
> non-zero counters for spill_bytes/txns and zeroes for total_bytes/txns.
> It seems to me it's legitimate that the slot stats view shows non-zero
> values for spill_bytes/txns and zero values for total_bytes/txns
> during decoding a large transaction. So I think we can fix the test
> script so that it checks only spill_bytes/txns when checking spilled
> transactions.
>

Your analysis and fix look correct to me. I'll test and push your patch
if I don't see any problem with it.

> For the record, during streaming transactions, IIUC this kind of thing
> doesn't happen since we update both total_bytes/txns and
> stream_bytes/txns before reporting slot stats.
>

Right, because during streaming, we send the data to the decoding plugin.

--
With Regards,
Amit Kapila.
Re: Replication slot stats misgivings
On Wed, May 12, 2021 at 6:32 AM Tom Lane wrote:
>
> Amit Kapila writes:
> > I have closed this open item.
>
> That seems a little premature, considering that the
> contrib/test_decoding/sql/stats.sql test case is still failing regularly.

Thank you for reporting.

Ugh, since by commit 592f00f8de we send slot stats after every
spill/stream, it's possible that we report slot stats that have
non-zero counters for spill_bytes/txns and zeroes for total_bytes/txns.
It seems to me it's legitimate that the slot stats view shows non-zero
values for spill_bytes/txns and zero values for total_bytes/txns
during decoding a large transaction. So I think we can fix the test
script so that it checks only spill_bytes/txns when checking spilled
transactions.

For the record, during streaming transactions, IIUC this kind of thing
doesn't happen since we update both total_bytes/txns and
stream_bytes/txns before reporting slot stats.

I've attached a patch to fix it.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

diff --git a/contrib/test_decoding/expected/stats.out b/contrib/test_decoding/expected/stats.out
index 7d174ee8d2..19bbd76c10 100644
--- a/contrib/test_decoding/expected/stats.out
+++ b/contrib/test_decoding/expected/stats.out
@@ -111,10 +111,10 @@ SELECT wait_for_decode_stats(false, true);
 
 (1 row)
 
-SELECT slot_name, spill_txns > 0 AS spill_txns, spill_count > 0 AS spill_count, total_txns > 0 AS total_txns, total_bytes > 0 AS total_bytes FROM pg_stat_replication_slots;
-       slot_name       | spill_txns | spill_count | total_txns | total_bytes
------------------------+------------+-------------+------------+-------------
- regression_slot_stats | t          | t           | t          | t
+SELECT slot_name, spill_txns > 0 AS spill_txns, spill_count > 0 AS spill_count FROM pg_stat_replication_slots;
+       slot_name       | spill_txns | spill_count
+-----------------------+------------+-------------
+ regression_slot_stats | t          | t
 (1 row)
 
 -- reset the slot stats, and wait for stats collector to reset
@@ -149,10 +149,10 @@ SELECT wait_for_decode_stats(false, true);
 
 (1 row)
 
-SELECT slot_name, spill_txns > 0 AS spill_txns, spill_count > 0 AS spill_count, total_txns > 0 AS total_txns, total_bytes > 0 AS total_bytes FROM pg_stat_replication_slots;
-       slot_name       | spill_txns | spill_count | total_txns | total_bytes
------------------------+------------+-------------+------------+-------------
- regression_slot_stats | t          | t           | t          | t
+SELECT slot_name, spill_txns > 0 AS spill_txns, spill_count > 0 AS spill_count FROM pg_stat_replication_slots;
+       slot_name       | spill_txns | spill_count
+-----------------------+------------+-------------
+ regression_slot_stats | t          | t
 (1 row)
 
 -- Ensure stats can be repeatedly accessed using the same stats snapshot. See
diff --git a/contrib/test_decoding/sql/stats.sql b/contrib/test_decoding/sql/stats.sql
index 263568b00c..93b6103034 100644
--- a/contrib/test_decoding/sql/stats.sql
+++ b/contrib/test_decoding/sql/stats.sql
@@ -73,7 +73,7 @@ SELECT count(*) FROM pg_logical_slot_peek_changes('regression_slot_stats', NULL,
 -- exact stats count as that can vary if any background transaction (say by
 -- autovacuum) happens in parallel to the main transaction.
 SELECT wait_for_decode_stats(false, true);
-SELECT slot_name, spill_txns > 0 AS spill_txns, spill_count > 0 AS spill_count, total_txns > 0 AS total_txns, total_bytes > 0 AS total_bytes FROM pg_stat_replication_slots;
+SELECT slot_name, spill_txns > 0 AS spill_txns, spill_count > 0 AS spill_count FROM pg_stat_replication_slots;
 
 -- reset the slot stats, and wait for stats collector to reset
 SELECT pg_stat_reset_replication_slot('regression_slot_stats');
@@ -83,7 +83,7 @@ SELECT slot_name, spill_txns, spill_count, total_txns, total_bytes FROM pg_stat_
 -- decode and check stats again.
 SELECT count(*) FROM pg_logical_slot_peek_changes('regression_slot_stats', NULL, NULL, 'skip-empty-xacts', '1');
 SELECT wait_for_decode_stats(false, true);
-SELECT slot_name, spill_txns > 0 AS spill_txns, spill_count > 0 AS spill_count, total_txns > 0 AS total_txns, total_bytes > 0 AS total_bytes FROM pg_stat_replication_slots;
+SELECT slot_name, spill_txns > 0 AS spill_txns, spill_count > 0 AS spill_count FROM pg_stat_replication_slots;
 
 -- Ensure stats can be repeatedly accessed using the same stats snapshot. See
 -- https://postgr.es/m/20210317230447.c7uc4g3vbs4wi32i%40alap3.anarazel.de
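[Editor's note] The asymmetry the patch works around can be sketched with a toy model (hypothetical illustration, not the actual reorderbuffer code): spill counters are reported at the moment a transaction spills to disk, while total counters are only reported once the whole transaction has been decoded, so a view read mid-decode can legitimately show spill counters ahead of totals.

```python
def decode_large_transaction(stats, changes, memory_limit, interrupt_after=None):
    """Toy model of decoding one large transaction: spill counters are
    updated as spilling happens; total counters only at the very end."""
    buffered = 0
    for i in range(changes):
        buffered += 1
        if buffered > memory_limit:       # memory exceeded: spill to disk
            stats["spill_txns"] = 1
            stats["spill_count"] += 1
            buffered = 0
        if interrupt_after is not None and i == interrupt_after:
            return stats                  # decoding interrupted mid-transaction
    stats["total_txns"] += 1              # reported only when decoding completes
    return stats

stats = {"spill_txns": 0, "spill_count": 0, "total_txns": 0}
# Observe the stats after some spills but before the transaction finishes:
decode_large_transaction(stats, changes=100, memory_limit=10, interrupt_after=50)
assert stats["spill_count"] > 0 and stats["total_txns"] == 0
```

This is why the fixed test asserts only on spill_txns/spill_count: those are guaranteed non-zero once spilling has occurred, whereas the totals depend on when the snapshot of the view is taken.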
Re: Replication slot stats misgivings
Amit Kapila writes:
> I have closed this open item.

That seems a little premature, considering that the
contrib/test_decoding/sql/stats.sql test case is still failing regularly.

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=locust&dt=2021-05-11%2019%3A14%3A53
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=peripatus&dt=2021-05-07%2010%3A20%3A21

			regards, tom lane
Re: Replication slot stats misgivings
On Fri, May 7, 2021 at 8:03 AM Amit Kapila wrote:
>
> Thanks for the summarization. I don't find anything that is left
> unaddressed. I think we can wait for a day or two to see if Andres or
> anyone else sees anything that is left unaddressed and then we can
> close the open item.
>

I have closed this open item.

--
With Regards,
Amit Kapila.
Re: Replication slot stats misgivings
On Fri, May 7, 2021 at 6:10 AM Masahiko Sawada wrote:
>
> On Sat, Mar 20, 2021 at 3:52 AM Andres Freund wrote:
> >
> > Hi,
> >
> > I started to write this as a reply to
> > https://postgr.es/m/20210318015105.dcfa4ceybdjubf2i%40alap3.anarazel.de
> > but I think it doesn't really fit under that header anymore.
> >
> > On 2021-03-17 18:51:05 -0700, Andres Freund wrote:
> > > It does make it easier for the shared memory stats patch, because if
> > > there's a fixed number + location, the relevant stats reporting doesn't
> > > need to go through a hashtable with the associated locking. I guess
> > > that may have colored my perception that it's better to just have a
> > > statically sized memory allocation for this. Noteworthy that SLRU stats
> > > are done in a fixed size allocation as well...
>
> Through a long discussion, all review comments pointed out here have
> been addressed. I have summarized which commit fixes each individual
> review comment, to avoid any confusion later.
>
> >
> > As part of reviewing the replication slot stats patch I looked at
> > replication slot stats a fair bit, and I've a few misgivings. First,
> > about the pgstat.c side of things:
> >
> > - If somehow slot stat drop messages got lost (remember pgstat
> >   communication is lossy!), we'll just stop maintaining stats for slots
> >   created later, because there'll eventually be no space for keeping
> >   stats for another slot.
> >
> > - If max_replication_slots was lowered between a restart,
> >   pgstat_read_statfile() will happily write beyond the end of
> >   replSlotStats.
> >
> > - pgstat_reset_replslot_counter() acquires ReplicationSlotControlLock. I
> >   think pgstat.c has absolutely no business doing things on that level.
> >
> > - We do a linear search through all replication slots whenever receiving
> >   stats for a slot. Even though there'd be a perfectly good index to
> >   just use all throughout - the slots index itself. It looks to me like
> >   slots stat reports can be fairly frequent in some workloads, so that
> >   doesn't seem great.
>
> Fixed by 3fa17d37716.
>
> > - PgStat_ReplSlotStats etc use slotname[NAMEDATALEN]. Why not just
> >   NameData?
> >
> > - pgstat_report_replslot() already has a lot of stats parameters, it
> >   seems likely that we'll get more. Seems like we should just use a
> >   struct of stats updates.
>
> Fixed by cca57c1d9bf.
>
> > And then more generally about the feature:
> > - If a slot was used to stream out a large amount of changes (say an
> >   initial data load), but then replication is interrupted before the
> >   transaction is committed/aborted, stream_bytes will not reflect the
> >   many gigabytes of data we may have sent.
>
> Fixed by 592f00f8d.
>
> > - It seems weird that we went to the trouble of inventing replication
> >   slot stats, but then limit them to logical slots, and even there don't
> >   record the obvious things like the total amount of data sent.
>
> Fixed by f5fc2f5b23d.
>
> > I think the best way to address the more fundamental "pgstat related"
> > complaints is to change how replication slot stats are
> > "addressed". Instead of using the slot's name, report stats using the
> > index in ReplicationSlotCtl->replication_slots.
> >
> > That removes the risk of running out of "replication slot stat slots":
> > If we lose a drop message, the index eventually will be reused and we
> > likely can detect that the stats were for a different slot by comparing
> > the slot name.
> >
> > It also makes it easy to handle the issue of max_replication_slots being
> > lowered and there still being stats for a slot - we simply can skip
> > restoring that slot's data, because we know the relevant slot can't exist
> > anymore. And we can make the initial pgstat_report_replslot() during
> > slot creation use a
> >
> > I'm wondering if we should just remove the slot name entirely from the
> > pgstat.c side of things, and have pg_stat_get_replication_slots()
> > inquire about slots by index as well and get the list of slots to report
> > stats for from slot.c infrastructure.
>
> We fixed the problem of "running out of replication slot stat slots" by
> using an HTAB to store slot stats, see 3fa17d37716. The slot stats
> could be orphaned if a slot drop message gets lost, but we constantly
> check for and remove such entries in pgstat_vacuum_stat().
>

Thanks for the summarization. I don't find anything that is left
unaddressed. I think we can wait for a day or two to see if Andres or
anyone else sees anything that is left unaddressed and then we can
close the open item.

--
With Regards,
Amit Kapila.
Re: Replication slot stats misgivings
On Sat, Mar 20, 2021 at 3:52 AM Andres Freund wrote:
>
> Hi,
>
> I started to write this as a reply to
> https://postgr.es/m/20210318015105.dcfa4ceybdjubf2i%40alap3.anarazel.de
> but I think it doesn't really fit under that header anymore.
>
> On 2021-03-17 18:51:05 -0700, Andres Freund wrote:
> > It does make it easier for the shared memory stats patch, because if
> > there's a fixed number + location, the relevant stats reporting doesn't
> > need to go through a hashtable with the associated locking. I guess
> > that may have colored my perception that it's better to just have a
> > statically sized memory allocation for this. Noteworthy that SLRU stats
> > are done in a fixed size allocation as well...

Through a long discussion, all review comments pointed out here have
been addressed. I have summarized which commit fixes each individual
review comment, to avoid any confusion later.

>
> As part of reviewing the replication slot stats patch I looked at
> replication slot stats a fair bit, and I've a few misgivings. First,
> about the pgstat.c side of things:
>
> - If somehow slot stat drop messages got lost (remember pgstat
>   communication is lossy!), we'll just stop maintaining stats for slots
>   created later, because there'll eventually be no space for keeping
>   stats for another slot.
>
> - If max_replication_slots was lowered between a restart,
>   pgstat_read_statfile() will happily write beyond the end of
>   replSlotStats.
>
> - pgstat_reset_replslot_counter() acquires ReplicationSlotControlLock. I
>   think pgstat.c has absolutely no business doing things on that level.
>
> - We do a linear search through all replication slots whenever receiving
>   stats for a slot. Even though there'd be a perfectly good index to
>   just use all throughout - the slots index itself. It looks to me like
>   slots stat reports can be fairly frequent in some workloads, so that
>   doesn't seem great.

Fixed by 3fa17d37716.

> - PgStat_ReplSlotStats etc use slotname[NAMEDATALEN]. Why not just
>   NameData?
>
> - pgstat_report_replslot() already has a lot of stats parameters, it
>   seems likely that we'll get more. Seems like we should just use a
>   struct of stats updates.

Fixed by cca57c1d9bf.

> And then more generally about the feature:
> - If a slot was used to stream out a large amount of changes (say an
>   initial data load), but then replication is interrupted before the
>   transaction is committed/aborted, stream_bytes will not reflect the
>   many gigabytes of data we may have sent.

Fixed by 592f00f8d.

> - It seems weird that we went to the trouble of inventing replication
>   slot stats, but then limit them to logical slots, and even there don't
>   record the obvious things like the total amount of data sent.

Fixed by f5fc2f5b23d.

> I think the best way to address the more fundamental "pgstat related"
> complaints is to change how replication slot stats are
> "addressed". Instead of using the slot's name, report stats using the
> index in ReplicationSlotCtl->replication_slots.
>
> That removes the risk of running out of "replication slot stat slots":
> If we lose a drop message, the index eventually will be reused and we
> likely can detect that the stats were for a different slot by comparing
> the slot name.
>
> It also makes it easy to handle the issue of max_replication_slots being
> lowered and there still being stats for a slot - we simply can skip
> restoring that slot's data, because we know the relevant slot can't exist
> anymore. And we can make the initial pgstat_report_replslot() during
> slot creation use a
>
> I'm wondering if we should just remove the slot name entirely from the
> pgstat.c side of things, and have pg_stat_get_replication_slots()
> inquire about slots by index as well and get the list of slots to report
> stats for from slot.c infrastructure.

We fixed the problem of "running out of replication slot stat slots" by
using an HTAB to store slot stats, see 3fa17d37716. The slot stats could
be orphaned if a slot drop message gets lost, but we constantly check
for and remove such entries in pgstat_vacuum_stat().

For the record, there are two known issues that are unlikely to happen
in practice or don't affect users much: (1) if the messages for creation
and drop of a slot of the same name get lost and the creation happens
before (auto)vacuum cleans up the dead slot, the stats will be
accumulated into the old slot, and (2) we could miss the total_txns and
total_bytes updates if logical decoding is interrupted. For (1), there
is an idea of having OIDs for each slot to avoid the accumulation of
stats, but that doesn't seem worth doing as in practice this won't
happen frequently. For (2), there are some ideas of reporting slot stats
at slot release (by keeping stats in ReplicationSlot or by using a
callback), but we decided to go with reporting slot stats after every
stream/spill, because it covers most cases in practice and is much
simpler than the other approaches.

Regards,

--
Masahiko
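[Editor's note] The index-based addressing Andres proposed, with a stored slot name to detect index reuse after a lost drop message, can be sketched as a toy model (hypothetical illustration; the committed fix actually uses an HTAB, and `SlotStatsStore` is an invented name, not PostgreSQL code):

```python
class SlotStatsStore:
    """Toy model: stats addressed by slot index; the stored slot name
    exposes stale entries when an index is reused after a lost drop."""

    def __init__(self, max_slots):
        self.entries = [None] * max_slots   # each entry: (name, counters) or None

    def report(self, index, name, total_bytes_delta):
        entry = self.entries[index]
        if entry is None or entry[0] != name:
            # First report, or the index was reused by a newly created slot:
            # discard any stale stats left behind by a lost drop message.
            entry = (name, {"total_bytes": 0})
            self.entries[index] = entry
        entry[1]["total_bytes"] += total_bytes_delta

    def get(self, index, name):
        entry = self.entries[index]
        return entry[1] if entry is not None and entry[0] == name else None

store = SlotStatsStore(max_slots=4)
store.report(0, "slot_a", 100)
# The drop message for slot_a is lost; index 0 is later reused by slot_b:
store.report(0, "slot_b", 10)
assert store.get(0, "slot_b") == {"total_bytes": 10}  # stale stats not inherited
assert store.get(0, "slot_a") is None
```

The fixed-size list also shows why lowering max_replication_slots is harmless under this scheme: entries beyond the new bound simply cannot correspond to a live slot and can be skipped when restoring the stats file.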
Re: Replication slot stats misgivings
On Thu, May 6, 2021 at 2:05 PM Masahiko Sawada wrote:
>
> On Thu, May 6, 2021 at 5:28 PM Amit Kapila wrote:
> >
> > On Thu, May 6, 2021 at 1:30 PM Masahiko Sawada wrote:
> > >
> > > All issues pointed out in this thread are resolved and we can remove
> > > this item from the open items?
> > >
> >
> > I think so. Do you think we should reply to Andres's original email
> > stating the commits that fixed the individual review comments to avoid
> > any confusion later?
>
> Good idea. That's also helpful for confirming that all comments are
> addressed. Would you like to gather those commits? or shall I?
>

I am fine either way. I will do it tomorrow unless you have responded
before that.

--
With Regards,
Amit Kapila.
Re: Replication slot stats misgivings
On Thu, May 6, 2021 at 5:28 PM Amit Kapila wrote: > > On Thu, May 6, 2021 at 1:30 PM Masahiko Sawada wrote: > > > > All issues pointed out in this thread are resolved and we can remove > > this item from the open items? > > > > I think so. Do you think we should reply to Andres's original email > stating the commits that fixed the individual review comments to avoid > any confusion later? Good idea. That's also helpful for confirming that all comments are addressed. Would you like to gather those commits? or shall I? Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Replication slot stats misgivings
On Thu, May 6, 2021 at 1:30 PM Masahiko Sawada wrote: > > On Thu, May 6, 2021 at 4:03 PM Amit Kapila wrote: > > > > On Thu, May 6, 2021 at 10:55 AM Masahiko Sawada > > wrote: > > > > > > On Thu, May 6, 2021 at 1:09 PM Amit Kapila > > > wrote: > > > > > > > > > > > In the attached, I have combined > > > > Vignesh's patch and your doc fix patch. Additionally, I have changed > > > > some comments and some other cosmetic stuff. Let me know what you > > > > think of the attached? > > > > > > Thank you for updating the patch. The patch looks good to me. > > > > > > > Pushed! Thanks for committing. > > All issues pointed out in this thread are resolved and we can remove > this item from the open items? I felt all the comments listed have been addressed. Regards, Vignesh
Re: Replication slot stats misgivings
On Thu, May 6, 2021 at 1:30 PM Masahiko Sawada wrote: > > All issues pointed out in this thread are resolved and we can remove > this item from the open items? > I think so. Do you think we should reply to Andres's original email stating the commits that fixed the individual review comments to avoid any confusion later? -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Thu, May 6, 2021 at 4:03 PM Amit Kapila wrote: > > On Thu, May 6, 2021 at 10:55 AM Masahiko Sawada wrote: > > > > On Thu, May 6, 2021 at 1:09 PM Amit Kapila wrote: > > > > > > > > In the attached, I have combined > > > Vignesh's patch and your doc fix patch. Additionally, I have changed > > > some comments and some other cosmetic stuff. Let me know what you > > > think of the attached? > > > > Thank you for updating the patch. The patch looks good to me. > > > > Pushed! Thanks! All issues pointed out in this thread are resolved and we can remove this item from the open items? Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Replication slot stats misgivings
On Thu, May 6, 2021 at 10:55 AM Masahiko Sawada wrote: > > On Thu, May 6, 2021 at 1:09 PM Amit Kapila wrote: > > > > > In the attached, I have combined > > Vignesh's patch and your doc fix patch. Additionally, I have changed > > some comments and some other cosmetic stuff. Let me know what you > > think of the attached? > > Thank you for updating the patch. The patch looks good to me. > Pushed! -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Thu, May 6, 2021 at 9:39 AM Amit Kapila wrote: > > On Thu, May 6, 2021 at 6:15 AM Masahiko Sawada wrote: > > > > After more thought, I'm concerned that my patch's approach might be > > invasive for PG14. Given that Vignesh’s patch would cover most cases, > > > > I am not sure if your patch is too invasive but OTOH I am also > convinced that Vignesh's patch covers most cases and is much simpler > so we can go ahead with that. In the attached, I have combined > Vignesh's patch and your doc fix patch. Additionally, I have changed > some comments and some other cosmetic stuff. Let me know what you > think of the attached? The updated patch looks good to me. Regards, Vignesh
Re: Replication slot stats misgivings
On Thu, May 6, 2021 at 1:09 PM Amit Kapila wrote: > > On Thu, May 6, 2021 at 6:15 AM Masahiko Sawada wrote: > > > > After more thought, I'm concerned that my patch's approach might be > > invasive for PG14. Given that Vignesh’s patch would cover most cases, > > > > I am not sure if your patch is too invasive but OTOH I am also > convinced that Vignesh's patch covers most cases and is much simpler > so we can go ahead with that. I think that my patch also affects other code, including logical decoding and the decoding context. We would need to write code while being careful about MyLogicalDecodingContext. > In the attached, I have combined > Vignesh's patch and your doc fix patch. Additionally, I have changed > some comments and some other cosmetic stuff. Let me know what you > think of the attached? Thank you for updating the patch. The patch looks good to me. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Replication slot stats misgivings
On Thu, May 6, 2021 at 6:15 AM Masahiko Sawada wrote: > > After more thought, I'm concerned that my patch's approach might be > invasive for PG14. Given that Vignesh’s patch would cover most cases, > I am not sure if your patch is too invasive but OTOH I am also convinced that Vignesh's patch covers most cases and is much simpler so we can go ahead with that. In the attached, I have combined Vignesh's patch and your doc fix patch. Additionally, I have changed some comments and some other cosmetic stuff. Let me know what you think of the attached? -- With Regards, Amit Kapila. v5-0001-Update-replication-statistics-after-every-stream-.patch Description: Binary data
Re: Replication slot stats misgivings
On Mon, May 3, 2021 at 9:18 PM Masahiko Sawada wrote: > > On Mon, May 3, 2021 at 2:27 PM Amit Kapila wrote: > > > Apart from this, I think you > > have suggested somewhere in this thread to slightly update the > > description of stream_bytes. I would like to update the description of > > stream_bytes and total_bytes as below: > > > > stream_bytes > > Amount of transaction data decoded for streaming in-progress > > transactions to the decoding output plugin while decoding changes from > > WAL for this slot. This and other streaming counters for this slot can > > be used to tune logical_decoding_work_mem. > > > > total_bytes > > Amount of transaction data decoded for sending transactions to the > > decoding output plugin while decoding changes from WAL for this slot. > > Note that this includes data that is streamed and/or spilled. > > > > This update considers two points: > > a. we don't send this data across the network because plugin might > > decide to filter this data, ex. based on publications. > > b. not all of the decoded changes are sent to plugin, consider > > REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID, > > REORDER_BUFFER_CHANGE_INTERNAL_SNAPSHOT, etc. > > Looks good to me. Attached the doc update patch. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ fix_doc.patch Description: Binary data
Re: Replication slot stats misgivings
On Tue, May 4, 2021 at 2:34 PM vignesh C wrote: > > On Tue, May 4, 2021 at 9:48 AM Masahiko Sawada wrote: > > > > On Mon, May 3, 2021 at 10:21 PM Amit Kapila wrote: > > > > > > On Mon, May 3, 2021 at 5:48 PM Masahiko Sawada > > > wrote: > > > > > > > > On Mon, May 3, 2021 at 2:27 PM Amit Kapila > > > > wrote: > > > > > > > > > > On Thu, Apr 29, 2021 at 10:37 AM Amit Kapila > > > > > wrote: > > > > > > > > > > > > On Wed, Apr 28, 2021 at 7:43 PM Masahiko Sawada > > > > > > wrote: > > > > > > > > > > > > > > On Wed, Apr 28, 2021 at 3:25 PM Amit Kapila > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > I am not sure if any of these alternatives are a good idea. > > > > > > > > What do > > > > > > > > you think? Do you have any other ideas for this? > > > > > > > > > > > > > > I've been considering some ideas but don't come up with a good one > > > > > > > yet. It’s just an idea and not tested but how about having > > > > > > > CreateDecodingContext() register before_shmem_exit() callback > > > > > > > with the > > > > > > > decoding context to ensure that we send slot stats even on > > > > > > > interruption. And FreeDecodingContext() cancels the callback. > > > > > > > > > > > > > > > > > > > Is it a good idea to send stats while exiting and rely on the same? > > > > > > I > > > > > > think before_shmem_exit is mostly used for the cleanup purpose so > > > > > > not > > > > > > sure if we can rely on it for this purpose. I think we can't be sure > > > > > > that in all cases we will send all stats, so maybe Vignesh's patch > > > > > > is > > > > > > sufficient to cover the cases where we avoid losing it in cases > > > > > > where > > > > > > we would have sent a large amount of data. > > > > > > > > > > > > > > > > Sawada-San, any thoughts on this point? > > > > > > > > before_shmem_exit is mostly used to the cleanup purpose but how about > > > > on_shmem_exit()? pgstats relies on that to send stats at the > > > > interruption. 
See pgstat_shutdown_hook(). > > > > > > > > > > Yeah, that is worth trying. Would you like to give it a try? > > > > Yes. > > > > In this approach, I think we will need to have a static pointer in > > logical.c pointing to LogicalDecodingContext that we’re using. At > > StartupDecodingContext(), we set the pointer to the just created > > LogicalDecodingContext and register the callback so that we can refer > > to the LogicalDecodingContext on that callback. And at > > FreeDecodingContext(), we reset the pointer to NULL (however, since > > FreeDecodingContext() is not called when an error happens we would > > need to ensure resetting it somehow). But, after more thought, if we > > have the static pointer in logical.c it would rather be better to have > > a global function that sends slot stats based on the > > LogicalDecodingContext pointed by the static pointer and can be called > > by ReplicationSlotRelease(). That way, we don’t need to worry about > > erroring out cases as well as interruption cases, although we need to > > have a new static pointer. > > > > I've attached a quick-hacked patch. I also incorporated the change > > that calls UpdateDecodingStats() at FreeDecodingContext() so that we > > can send slot stats also in the case where we spilled/streamed changes > > but finished without commit/abort/prepare record. > > > > > I think > > > it still might not cover the cases where we error out in the backend > > > while decoding via APIs because at that time we won't exit, maybe for > > > that we can consider Vignesh's patch. > > > > Agreed. It seems to me that the approach of the attached patch is > > better than the approach using on_shmem_exit(). So if we want to avoid > > having the new static pointer and function for this purpose we can > > consider Vignesh’s patch. > > > > I'm ok with using either my patch or Sawada san's patch, Even I had > the same thought of whether we should have a static variable thought > pointed out by Sawada san. 
Apart from that I had one minor comment: > This comment needs to be corrected "andu sed to sent" > +/* > + * Pointing to the currently-used logical decoding context andu sed to sent > + * slot statistics on releasing slots. > + */ > +static LogicalDecodingContext *MyLogicalDecodingContext = NULL; > + Right, that needs to be fixed. After more thought, I'm concerned that my patch's approach might be invasive for PG14. Given that Vignesh’s patch would cover most cases, I think we can live with a small downside that could miss some slot stats. If we want to ensure that slot stats are sent when releasing a slot, we can develop it as an improvement. My patch, including its design, would be better reviewed by more people during PG15 development. Thoughts? Regards, --- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Replication slot stats misgivings
On Tue, May 4, 2021 at 9:48 AM Masahiko Sawada wrote: > > On Mon, May 3, 2021 at 10:21 PM Amit Kapila wrote: > > > > On Mon, May 3, 2021 at 5:48 PM Masahiko Sawada > > wrote: > > > > > > On Mon, May 3, 2021 at 2:27 PM Amit Kapila > > > wrote: > > > > > > > > On Thu, Apr 29, 2021 at 10:37 AM Amit Kapila > > > > wrote: > > > > > > > > > > On Wed, Apr 28, 2021 at 7:43 PM Masahiko Sawada > > > > > wrote: > > > > > > > > > > > > On Wed, Apr 28, 2021 at 3:25 PM Amit Kapila > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > I am not sure if any of these alternatives are a good idea. What > > > > > > > do > > > > > > > you think? Do you have any other ideas for this? > > > > > > > > > > > > I've been considering some ideas but don't come up with a good one > > > > > > yet. It’s just an idea and not tested but how about having > > > > > > CreateDecodingContext() register before_shmem_exit() callback with > > > > > > the > > > > > > decoding context to ensure that we send slot stats even on > > > > > > interruption. And FreeDecodingContext() cancels the callback. > > > > > > > > > > > > > > > > Is it a good idea to send stats while exiting and rely on the same? I > > > > > think before_shmem_exit is mostly used for the cleanup purpose so not > > > > > sure if we can rely on it for this purpose. I think we can't be sure > > > > > that in all cases we will send all stats, so maybe Vignesh's patch is > > > > > sufficient to cover the cases where we avoid losing it in cases where > > > > > we would have sent a large amount of data. > > > > > > > > > > > > > Sawada-San, any thoughts on this point? > > > > > > before_shmem_exit is mostly used to the cleanup purpose but how about > > > on_shmem_exit()? pgstats relies on that to send stats at the > > > interruption. See pgstat_shutdown_hook(). > > > > > > > Yeah, that is worth trying. Would you like to give it a try? > > Yes. 
> > In this approach, I think we will need to have a static pointer in > logical.c pointing to LogicalDecodingContext that we’re using. At > StartupDecodingContext(), we set the pointer to the just created > LogicalDecodingContext and register the callback so that we can refer > to the LogicalDecodingContext on that callback. And at > FreeDecodingContext(), we reset the pointer to NULL (however, since > FreeDecodingContext() is not called when an error happens we would > need to ensure resetting it somehow). But, after more thought, if we > have the static pointer in logical.c it would rather be better to have > a global function that sends slot stats based on the > LogicalDecodingContext pointed by the static pointer and can be called > by ReplicationSlotRelease(). That way, we don’t need to worry about > erroring out cases as well as interruption cases, although we need to > have a new static pointer. > > I've attached a quick-hacked patch. I also incorporated the change > that calls UpdateDecodingStats() at FreeDecodingContext() so that we > can send slot stats also in the case where we spilled/streamed changes > but finished without commit/abort/prepare record. > > > I think > > it still might not cover the cases where we error out in the backend > > while decoding via APIs because at that time we won't exit, maybe for > > that we can consider Vignesh's patch. > > Agreed. It seems to me that the approach of the attached patch is > better than the approach using on_shmem_exit(). So if we want to avoid > having the new static pointer and function for this purpose we can > consider Vignesh’s patch. > I'm ok with using either my patch or Sawada san's patch, Even I had the same thought of whether we should have a static variable thought pointed out by Sawada san. 
Apart from that I had one minor comment: This comment needs to be corrected "andu sed to sent" +/* + * Pointing to the currently-used logical decoding context andu sed to sent + * slot statistics on releasing slots. + */ +static LogicalDecodingContext *MyLogicalDecodingContext = NULL; + Regards, Vignesh
Re: Replication slot stats misgivings
On Mon, May 3, 2021 at 10:21 PM Amit Kapila wrote: > > On Mon, May 3, 2021 at 5:48 PM Masahiko Sawada wrote: > > > > On Mon, May 3, 2021 at 2:27 PM Amit Kapila wrote: > > > > > > On Thu, Apr 29, 2021 at 10:37 AM Amit Kapila > > > wrote: > > > > > > > > On Wed, Apr 28, 2021 at 7:43 PM Masahiko Sawada > > > > wrote: > > > > > > > > > > On Wed, Apr 28, 2021 at 3:25 PM Amit Kapila > > > > > wrote: > > > > > > > > > > > > > > > > > I am not sure if any of these alternatives are a good idea. What do > > > > > > you think? Do you have any other ideas for this? > > > > > > > > > > I've been considering some ideas but don't come up with a good one > > > > > yet. It’s just an idea and not tested but how about having > > > > > CreateDecodingContext() register before_shmem_exit() callback with the > > > > > decoding context to ensure that we send slot stats even on > > > > > interruption. And FreeDecodingContext() cancels the callback. > > > > > > > > > > > > > Is it a good idea to send stats while exiting and rely on the same? I > > > > think before_shmem_exit is mostly used for the cleanup purpose so not > > > > sure if we can rely on it for this purpose. I think we can't be sure > > > > that in all cases we will send all stats, so maybe Vignesh's patch is > > > > sufficient to cover the cases where we avoid losing it in cases where > > > > we would have sent a large amount of data. > > > > > > > > > > Sawada-San, any thoughts on this point? > > > > before_shmem_exit is mostly used to the cleanup purpose but how about > > on_shmem_exit()? pgstats relies on that to send stats at the > > interruption. See pgstat_shutdown_hook(). > > > > Yeah, that is worth trying. Would you like to give it a try? Yes. In this approach, I think we will need to have a static pointer in logical.c pointing to LogicalDecodingContext that we’re using. 
At StartupDecodingContext(), we set the pointer to the just created LogicalDecodingContext and register the callback so that we can refer to the LogicalDecodingContext on that callback. And at FreeDecodingContext(), we reset the pointer to NULL (however, since FreeDecodingContext() is not called when an error happens we would need to ensure resetting it somehow). But, after more thought, if we have the static pointer in logical.c it would rather be better to have a global function that sends slot stats based on the LogicalDecodingContext pointed by the static pointer and can be called by ReplicationSlotRelease(). That way, we don’t need to worry about erroring out cases as well as interruption cases, although we need to have a new static pointer. I've attached a quick-hacked patch. I also incorporated the change that calls UpdateDecodingStats() at FreeDecodingContext() so that we can send slot stats also in the case where we spilled/streamed changes but finished without commit/abort/prepare record. > I think > it still might not cover the cases where we error out in the backend > while decoding via APIs because at that time we won't exit, maybe for > that we can consider Vignesh's patch. Agreed. It seems to me that the approach of the attached patch is better than the approach using on_shmem_exit(). So if we want to avoid having the new static pointer and function for this purpose we can consider Vignesh’s patch. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c index 00543ede45..f32a2da565 100644 --- a/src/backend/replication/logical/logical.c +++ b/src/backend/replication/logical/logical.c @@ -51,6 +51,12 @@ typedef struct LogicalErrorCallbackState XLogRecPtr report_location; } LogicalErrorCallbackState; +/* + * Pointing to the currently-used logical decoding context andu sed to sent + * slot statistics on releasing slots. 
+ */ +static LogicalDecodingContext *MyLogicalDecodingContext = NULL; + /* wrappers around output plugin callbacks */ static void output_plugin_error_callback(void *arg); static void startup_cb_wrapper(LogicalDecodingContext *ctx, OutputPluginOptions *opt, @@ -290,6 +296,13 @@ StartupDecodingContext(List *output_plugin_options, MemoryContextSwitchTo(old_context); + /* + * Keep holding the decoding context until freeing the decoding context + * or releasing the logical slot. + */ + Assert(MyLogicalDecodingContext == NULL); + MyLogicalDecodingContext = ctx; + return ctx; } @@ -626,10 +639,12 @@ FreeDecodingContext(LogicalDecodingContext *ctx) if (ctx->callbacks.shutdown_cb != NULL) shutdown_cb_wrapper(ctx); + UpdateDecodingStats(ctx); ReorderBufferFree(ctx->reorder); FreeSnapshotBuilder(ctx->snapshot_builder); XLogReaderFree(ctx->reader); MemoryContextDelete(ctx->context); + MyLogicalDecodingContext = NULL; } /* @@ -1811,3 +1826,17 @@ UpdateDecodingStats(LogicalDecodingContext *ctx) rb->totalTxns = 0; rb->totalBytes = 0; } + +/* + * The function called at releasing a logical replication slot, sending
Re: Replication slot stats misgivings
On Mon, May 3, 2021 at 5:48 PM Masahiko Sawada wrote: > > On Mon, May 3, 2021 at 2:27 PM Amit Kapila wrote: > > > > On Thu, Apr 29, 2021 at 10:37 AM Amit Kapila > > wrote: > > > > > > On Wed, Apr 28, 2021 at 7:43 PM Masahiko Sawada > > > wrote: > > > > > > > > On Wed, Apr 28, 2021 at 3:25 PM Amit Kapila > > > > wrote: > > > > > > > > > > > > > > I am not sure if any of these alternatives are a good idea. What do > > > > > you think? Do you have any other ideas for this? > > > > > > > > I've been considering some ideas but don't come up with a good one > > > > yet. It’s just an idea and not tested but how about having > > > > CreateDecodingContext() register before_shmem_exit() callback with the > > > > decoding context to ensure that we send slot stats even on > > > > interruption. And FreeDecodingContext() cancels the callback. > > > > > > > > > > Is it a good idea to send stats while exiting and rely on the same? I > > > think before_shmem_exit is mostly used for the cleanup purpose so not > > > sure if we can rely on it for this purpose. I think we can't be sure > > > that in all cases we will send all stats, so maybe Vignesh's patch is > > > sufficient to cover the cases where we avoid losing it in cases where > > > we would have sent a large amount of data. > > > > > > > Sawada-San, any thoughts on this point? > > before_shmem_exit is mostly used to the cleanup purpose but how about > on_shmem_exit()? pgstats relies on that to send stats at the > interruption. See pgstat_shutdown_hook(). > Yeah, that is worth trying. Would you like to give it a try? I think it still might not cover the cases where we error out in the backend while decoding via APIs because at that time we won't exit, maybe for that we can consider Vignesh's patch. -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Mon, May 3, 2021 at 2:29 PM Amit Kapila wrote: > > On Fri, Apr 30, 2021 at 1:47 PM Amit Kapila wrote: > > > > LGTM. I have slightly edited the comments in the attached. I'll push > > this early next week unless there are more comments. > > > > Pushed. Thank you! Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Replication slot stats misgivings
On Mon, May 3, 2021 at 2:27 PM Amit Kapila wrote: > > On Thu, Apr 29, 2021 at 10:37 AM Amit Kapila wrote: > > > > On Wed, Apr 28, 2021 at 7:43 PM Masahiko Sawada > > wrote: > > > > > > On Wed, Apr 28, 2021 at 3:25 PM Amit Kapila > > > wrote: > > > > > > > > > > > I am not sure if any of these alternatives are a good idea. What do > > > > you think? Do you have any other ideas for this? > > > > > > I've been considering some ideas but don't come up with a good one > > > yet. It’s just an idea and not tested but how about having > > > CreateDecodingContext() register before_shmem_exit() callback with the > > > decoding context to ensure that we send slot stats even on > > > interruption. And FreeDecodingContext() cancels the callback. > > > > > > > Is it a good idea to send stats while exiting and rely on the same? I > > think before_shmem_exit is mostly used for the cleanup purpose so not > > sure if we can rely on it for this purpose. I think we can't be sure > > that in all cases we will send all stats, so maybe Vignesh's patch is > > sufficient to cover the cases where we avoid losing it in cases where > > we would have sent a large amount of data. > > > > Sawada-San, any thoughts on this point? before_shmem_exit is mostly used for cleanup purposes, but how about on_shmem_exit()? pgstats relies on that to send stats on interruption. See pgstat_shutdown_hook(). That being said, I agree Vignesh's patch would cover most cases. If we don't find any better solution, I think we can go with Vignesh's patch. > Apart from this, I think you > have suggested somewhere in this thread to slightly update the > description of stream_bytes. I would like to update the description of > stream_bytes and total_bytes as below: > > stream_bytes > Amount of transaction data decoded for streaming in-progress > transactions to the decoding output plugin while decoding changes from > WAL for this slot.
This and other streaming counters for this slot can > be used to tune logical_decoding_work_mem. > > total_bytes > Amount of transaction data decoded for sending transactions to the > decoding output plugin while decoding changes from WAL for this slot. > Note that this includes data that is streamed and/or spilled. > > This update considers two points: > a. we don't send this data across the network because plugin might > decide to filter this data, ex. based on publications. > b. not all of the decoded changes are sent to plugin, consider > REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID, > REORDER_BUFFER_CHANGE_INTERNAL_SNAPSHOT, etc. Looks good to me. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Replication slot stats misgivings
On Fri, Apr 30, 2021 at 1:47 PM Amit Kapila wrote: > > LGTM. I have slightly edited the comments in the attached. I'll push > this early next week unless there are more comments. > Pushed. -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Thu, Apr 29, 2021 at 10:37 AM Amit Kapila wrote: > > On Wed, Apr 28, 2021 at 7:43 PM Masahiko Sawada wrote: > > > > On Wed, Apr 28, 2021 at 3:25 PM Amit Kapila wrote: > > > > > > > > I am not sure if any of these alternatives are a good idea. What do > > > you think? Do you have any other ideas for this? > > > > I've been considering some ideas but don't come up with a good one > > yet. It’s just an idea and not tested but how about having > > CreateDecodingContext() register before_shmem_exit() callback with the > > decoding context to ensure that we send slot stats even on > > interruption. And FreeDecodingContext() cancels the callback. > > > > Is it a good idea to send stats while exiting and rely on the same? I > think before_shmem_exit is mostly used for the cleanup purpose so not > sure if we can rely on it for this purpose. I think we can't be sure > that in all cases we will send all stats, so maybe Vignesh's patch is > sufficient to cover the cases where we avoid losing it in cases where > we would have sent a large amount of data. > Sawada-San, any thoughts on this point? Apart from this, I think you have suggested somewhere in this thread to slightly update the description of stream_bytes. I would like to update the description of stream_bytes and total_bytes as below: stream_bytes Amount of transaction data decoded for streaming in-progress transactions to the decoding output plugin while decoding changes from WAL for this slot. This and other streaming counters for this slot can be used to tune logical_decoding_work_mem. total_bytes Amount of transaction data decoded for sending transactions to the decoding output plugin while decoding changes from WAL for this slot. Note that this includes data that is streamed and/or spilled. This update considers two points: a. we don't send this data across the network because plugin might decide to filter this data, ex. based on publications. b. 
not all of the decoded changes are sent to plugin, consider REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID, REORDER_BUFFER_CHANGE_INTERNAL_SNAPSHOT, etc. -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Fri, Apr 30, 2021 at 5:55 AM Masahiko Sawada wrote: > > After more thought, it seems to me that we should use txn->size here > regardless of the top transaction or subtransaction since we're > iterating changes associated with a transaction that is either the top > transaction or a subtransaction. Otherwise, I think if some > subtransactions are not serialized, we will end up adding bytes > including those subtransactions during iterating other serialized > subtransactions. Whereas in ReorderBufferProcessTXN() we should use > txn->total_size since txn is always the top transaction. I've attached > another patch to do this. > LGTM. I have slightly edited the comments in the attached. I'll push this early next week unless there are more comments. -- With Regards, Amit Kapila. use_total_size_v6.patch Description: Binary data
Re: Replication slot stats misgivings
On Fri, Apr 30, 2021 at 5:55 AM Masahiko Sawada wrote: > > On Thu, Apr 29, 2021 at 9:44 PM vignesh C wrote: > > > > > > > > On Thu, Apr 29, 2021 at 3:06 PM Amit Kapila wrote: > > > > > > On Wed, Apr 28, 2021 at 5:01 PM Amit Kapila > > > wrote: > > > > > > > > On Wed, Apr 28, 2021 at 4:51 PM Masahiko Sawada > > > > wrote: > > > > > > > > > > On Wed, Apr 28, 2021 at 6:39 PM Amit Kapila > > > > > wrote: > > > > > > > > @@ -1369,7 +1369,7 @@ ReorderBufferIterTXNNext(ReorderBuffer *rb, > > > > ReorderBufferIterTXNState *state) > > > > * Update the total bytes processed before releasing the current set > > > > * of changes and restoring the new set of changes. > > > > */ > > > > - rb->totalBytes += rb->size; > > > > + rb->totalBytes += entry->txn->total_size; > > > > if (ReorderBufferRestoreChanges(rb, entry->txn, &entry->file, > > > > &state->entries[off].segno)) > > > > > > > > I have not tested this but won't in the above change you need to check > > > > txn->toptxn for subtxns? > > > > > > > > > > Now, I am able to reproduce this issue: > > > Create table t1(c1 int); > > > select pg_create_logical_replication_slot('s', 'test_decoding'); > > > Begin; > > > insert into t1 values(1); > > > savepoint s1; > > > insert into t1 select generate_series(1, 10); > > > commit; > > > > > > postgres=# select count(*) from pg_logical_slot_peek_changes('s1', NULL, > > > NULL); > > > count > > > > > > 15 > > > (1 row) > > > > > > postgres=# select * from pg_stat_replication_slots; > > > slot_name | spill_txns | spill_count | spill_bytes | stream_txns | > > > stream_count | stream_bytes | total_txns | total_bytes | > > > stats_reset > > > ---++-+-+-+--+--++-+-- > > > s1| 0 | 0 | 0 | 0 | > > > 0 |0 | 2 |13200672 | 2021-04-29 > > > 14:33:55.156566+05:30 > > > (1 row) > > > > > > select * from pg_stat_reset_replication_slot('s1'); > > > > > > Now reduce the logical decoding work mem to allow spilling.
> > > postgres=# set logical_decoding_work_mem='64kB'; > > > SET > > > postgres=# select count(*) from pg_logical_slot_peek_changes('s1', NULL, > > > NULL); > > > count > > > > > > 15 > > > (1 row) > > > > > > postgres=# select * from pg_stat_replication_slots; > > > slot_name | spill_txns | spill_count | spill_bytes | stream_txns | > > > stream_count | stream_bytes | total_txns | total_bytes | > > > stats_reset > > > ---++-+-+-+--+--++-+-- > > > s1| 1 | 202 |1320 | 0 | > > > 0 |0 | 2 | 672 | 2021-04-29 > > > 14:35:21.836613+05:30 > > > (1 row) > > > > > > You can notice that after we have allowed spilling the 'total_bytes' > > > stats is showing a different value. The attached patch fixes the issue > > > for me. Let me know what do you think about this? > > > > I found one issue with the following scenario when testing with > > logical_decoding_work_mem as 64kB: > > > > BEGIN; > > INSERT INTO t1 values(generate_series(1,1)); > > SAVEPOINT s1; > > INSERT INTO t1 values(generate_series(1,1)); > > COMMIT; > > SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot1', NULL, > > NULL, 'include-xids', '0', 'skip-empty-xacts', '1'); > > select * from pg_stat_replication_slots; > > slot_name | spill_txns | spill_count | spill_bytes | stream_txns | > > stream_count | stream_bytes | total_txns | total_bytes | > > stats_reset > > --++-+-+-+--+--++-+-- > > regression_slot1 | 6 | 154 | 9130176 | 0 | > > 0 |0 | 1 | 4262016 | 2021-04-29 > > 17:50:00.080663+05:30 > > (1 row) > > > > Same thing works fine with logical_decoding_work_mem as 64MB: > > select * from pg_stat_replication_slots; > >slot_name | spill_txns | spill_count | spill_bytes | stream_txns | > > stream_count | stream_bytes | total_txns | total_bytes | > > stats_reset > > --++-+-+-+--+--++-+-- > > regression_slot1 | 6 | 154 | 9130176 | 0 | > > 0 |0 | 1 | 264 | 2021-04-29 > > 17:50:00.080663+05:30 > > (1 row) > > > > The patch required one change: > > - rb->totalBytes += rb->size; > > + if 
(entry->txn->toptxn) > > + rb->totalBytes += entry->txn->toptxn->total_size; > > + else > > + rb->totalBytes += entry->txn->total_size; >
Re: Replication slot stats misgivings
On Thu, Apr 29, 2021 at 12:07 PM Amit Kapila wrote: > > On Thu, Apr 29, 2021 at 11:14 AM Masahiko Sawada > wrote: > > > > > > > > How about doing both of the above suggestions? Alternatively, we can > > > wait for both 'drop' and 'create' message to be delivered but that > > > might be overkill. > > > > Agreed. Attached the patch doing both things. > > > > Thanks, the patch LGTM. I'll wait for a day before committing to see > if anyone has better ideas. > Pushed. -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Thu, Apr 29, 2021 at 9:44 PM vignesh C wrote: > > > > On Thu, Apr 29, 2021 at 3:06 PM Amit Kapila wrote: > > > > On Wed, Apr 28, 2021 at 5:01 PM Amit Kapila wrote: > > > > > > On Wed, Apr 28, 2021 at 4:51 PM Masahiko Sawada > > > wrote: > > > > > > > > On Wed, Apr 28, 2021 at 6:39 PM Amit Kapila > > > > wrote: > > > > > > @@ -1369,7 +1369,7 @@ ReorderBufferIterTXNNext(ReorderBuffer *rb, > > > ReorderBufferIterTXNState *state) > > > * Update the total bytes processed before releasing the current set > > > * of changes and restoring the new set of changes. > > > */ > > > - rb->totalBytes += rb->size; > > > + rb->totalBytes += entry->txn->total_size; > > > if (ReorderBufferRestoreChanges(rb, entry->txn, >file, > > > >entries[off].segno)) > > > > > > I have not tested this but won't in the above change you need to check > > > txn->toptxn for subtxns? > > > > > > > Now, I am able to reproduce this issue: > > Create table t1(c1 int); > > select pg_create_logical_replication_slot('s', 'test_decoding'); > > Begin; > > insert into t1 values(1); > > savepoint s1; > > insert into t1 select generate_series(1, 10); > > commit; > > > > postgres=# select count(*) from pg_logical_slot_peek_changes('s1', NULL, > > NULL); > > count > > > > 15 > > (1 row) > > > > postgres=# select * from pg_stat_replication_slots; > > slot_name | spill_txns | spill_count | spill_bytes | stream_txns | > > stream_count | stream_bytes | total_txns | total_bytes | > > stats_reset > > ---++-+-+-+--+--++-+-- > > s1| 0 | 0 | 0 | 0 | > > 0 |0 | 2 |13200672 | 2021-04-29 > > 14:33:55.156566+05:30 > > (1 row) > > > > select * from pg_stat_reset_replication_slot('s1'); > > > > Now reduce the logical decoding work mem to allow spilling. 
> > postgres=# set logical_decoding_work_mem='64kB'; > > SET > > postgres=# select count(*) from pg_logical_slot_peek_changes('s1', NULL, > > NULL); > > count > > > > 15 > > (1 row) > > > > postgres=# select * from pg_stat_replication_slots; > > slot_name | spill_txns | spill_count | spill_bytes | stream_txns | > > stream_count | stream_bytes | total_txns | total_bytes | > > stats_reset > > ---++-+-+-+--+--++-+-- > > s1| 1 | 202 |1320 | 0 | > > 0 |0 | 2 | 672 | 2021-04-29 > > 14:35:21.836613+05:30 > > (1 row) > > > > You can notice that after we have allowed spilling the 'total_bytes' > > stats is showing a different value. The attached patch fixes the issue > > for me. Let me know what do you think about this? > > I found one issue with the following scenario when testing with > logical_decoding_work_mem as 64kB: > > BEGIN; > INSERT INTO t1 values(generate_series(1,1)); > SAVEPOINT s1; > INSERT INTO t1 values(generate_series(1,1)); > COMMIT; > SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot1', NULL, > NULL, 'include-xids', '0', 'skip-empty-xacts', '1'); > select * from pg_stat_replication_slots; > slot_name | spill_txns | spill_count | spill_bytes | stream_txns | > stream_count | stream_bytes | total_txns | total_bytes | stats_reset > --++-+-+-+--+--++-+-- > regression_slot1 | 6 | 154 | 9130176 | 0 | > 0 |0 | 1 | 4262016 | 2021-04-29 > 17:50:00.080663+05:30 > (1 row) > > Same thing works fine with logical_decoding_work_mem as 64MB: > select * from pg_stat_replication_slots; >slot_name | spill_txns | spill_count | spill_bytes | stream_txns | > stream_count | stream_bytes | total_txns | total_bytes | stats_reset > --++-+-+-+--+--++-+-- > regression_slot1 | 6 | 154 | 9130176 | 0 | > 0 |0 | 1 | 264 | 2021-04-29 > 17:50:00.080663+05:30 > (1 row) > > The patch required one change: > - rb->totalBytes += rb->size; > + if (entry->txn->toptxn) > + rb->totalBytes += entry->txn->toptxn->total_size; > + else > + rb->totalBytes += entry->txn->total_size; > > 
The above should be changed to: > - rb->totalBytes += rb->size; > + if (entry->txn->toptxn) > + rb->totalBytes += entry->txn->toptxn->total_size; > + else > + rb->totalBytes += entry->txn->size; > > Attached patch fixes the issue. > Thoughts? After more thought, it seems to me that we should use
Re: Replication slot stats misgivings
On Thu, Apr 29, 2021 at 3:06 PM Amit Kapila wrote: > > On Wed, Apr 28, 2021 at 5:01 PM Amit Kapila wrote: > > > > On Wed, Apr 28, 2021 at 4:51 PM Masahiko Sawada wrote: > > > > > > On Wed, Apr 28, 2021 at 6:39 PM Amit Kapila wrote: > > > > @@ -1369,7 +1369,7 @@ ReorderBufferIterTXNNext(ReorderBuffer *rb, > > ReorderBufferIterTXNState *state) > > * Update the total bytes processed before releasing the current set > > * of changes and restoring the new set of changes. > > */ > > - rb->totalBytes += rb->size; > > + rb->totalBytes += entry->txn->total_size; > > if (ReorderBufferRestoreChanges(rb, entry->txn, >file, > > >entries[off].segno)) > > > > I have not tested this but won't in the above change you need to check > > txn->toptxn for subtxns? > > > > Now, I am able to reproduce this issue: > Create table t1(c1 int); > select pg_create_logical_replication_slot('s', 'test_decoding'); > Begin; > insert into t1 values(1); > savepoint s1; > insert into t1 select generate_series(1, 10); > commit; > > postgres=# select count(*) from pg_logical_slot_peek_changes('s1', NULL, NULL); > count > > 15 > (1 row) > > postgres=# select * from pg_stat_replication_slots; > slot_name | spill_txns | spill_count | spill_bytes | stream_txns | > stream_count | stream_bytes | total_txns | total_bytes | > stats_reset > ---++-+-+-+--+--++-+-- > s1| 0 | 0 | 0 | 0 | > 0 |0 | 2 |13200672 | 2021-04-29 > 14:33:55.156566+05:30 > (1 row) > > select * from pg_stat_reset_replication_slot('s1'); > > Now reduce the logical decoding work mem to allow spilling. 
> postgres=# set logical_decoding_work_mem='64kB'; > SET > postgres=# select count(*) from pg_logical_slot_peek_changes('s1', NULL, NULL); > count > > 15 > (1 row) > > postgres=# select * from pg_stat_replication_slots; > slot_name | spill_txns | spill_count | spill_bytes | stream_txns | > stream_count | stream_bytes | total_txns | total_bytes | > stats_reset > ---++-+-+-+--+--++-+-- > s1| 1 | 202 |1320 | 0 | > 0 |0 | 2 | 672 | 2021-04-29 > 14:35:21.836613+05:30 > (1 row) > > You can notice that after we have allowed spilling the 'total_bytes' > stats is showing a different value. The attached patch fixes the issue > for me. Let me know what do you think about this? I found one issue with the following scenario when testing with logical_decoding_work_mem as 64kB: BEGIN; INSERT INTO t1 values(generate_series(1,1)); SAVEPOINT s1; INSERT INTO t1 values(generate_series(1,1)); COMMIT; SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot1', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1'); select * from pg_stat_replication_slots; slot_name | spill_txns | spill_count | spill_bytes | stream_txns | stream_count | stream_bytes | total_txns | total_bytes | stats_reset --++-+-+-+--+--++-+-- regression_slot1 | 6 | 154 | 9130176 | 0 | 0 |0 | 1 | *4262016* | 2021-04-29 17:50:00.080663+05:30 (1 row) Same thing works fine with logical_decoding_work_mem as 64MB: select * from pg_stat_replication_slots; slot_name | spill_txns | spill_count | spill_bytes | stream_txns | stream_count | stream_bytes | total_txns | total_bytes | stats_reset --++-+-+-+--+--++-+-- regression_slot1 | 6 | 154 | 9130176 | 0 | 0 |0 | 1 | *264* | 2021-04-29 17:50:00.080663+05:30 (1 row) The patch required one change: - rb->totalBytes += rb->size; + if (entry->txn->toptxn) + rb->totalBytes += entry->txn->toptxn->total_size; + else + rb->totalBytes += entry->txn->*total_size*; The above should be changed to: - rb->totalBytes += rb->size; + if (entry->txn->toptxn) + rb->totalBytes += 
entry->txn->toptxn->total_size; + else + rb->totalBytes += entry->txn->*size*; Attached patch fixes the issue. Thoughts? Regards, Vignesh diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c index c27f710053..cdf46a36af 100644 --- a/src/backend/replication/logical/reorderbuffer.c +++ b/src/backend/replication/logical/reorderbuffer.c @@ -1369,7 +1369,10 @@ ReorderBufferIterTXNNext(ReorderBuffer *rb, ReorderBufferIterTXNState *state) * Update
Re: Replication slot stats misgivings
On Wed, Apr 28, 2021 at 5:01 PM Amit Kapila wrote:
> On Wed, Apr 28, 2021 at 4:51 PM Masahiko Sawada wrote:
> > On Wed, Apr 28, 2021 at 6:39 PM Amit Kapila wrote:
>
> @@ -1369,7 +1369,7 @@ ReorderBufferIterTXNNext(ReorderBuffer *rb, ReorderBufferIterTXNState *state)
>   * Update the total bytes processed before releasing the current set
>   * of changes and restoring the new set of changes.
>   */
> - rb->totalBytes += rb->size;
> + rb->totalBytes += entry->txn->total_size;
>   if (ReorderBufferRestoreChanges(rb, entry->txn, &entry->file,
>                                   &state->entries[off].segno))
>
> I have not tested this but won't in the above change you need to check
> txn->toptxn for subtxns?
>

Now, I am able to reproduce this issue:

Create table t1(c1 int);
select pg_create_logical_replication_slot('s', 'test_decoding');
Begin;
insert into t1 values(1);
savepoint s1;
insert into t1 select generate_series(1, 10);
commit;

postgres=# select count(*) from pg_logical_slot_peek_changes('s1', NULL, NULL);
 count
-------
    15
(1 row)

postgres=# select * from pg_stat_replication_slots;
 slot_name | spill_txns | spill_count | spill_bytes | stream_txns | stream_count | stream_bytes | total_txns | total_bytes |           stats_reset
-----------+------------+-------------+-------------+-------------+--------------+--------------+------------+-------------+----------------------------------
 s1        |          0 |           0 |           0 |           0 |            0 |            0 |          2 |    13200672 | 2021-04-29 14:33:55.156566+05:30
(1 row)

select * from pg_stat_reset_replication_slot('s1');

Now reduce the logical decoding work mem to allow spilling.

postgres=# set logical_decoding_work_mem='64kB';
SET
postgres=# select count(*) from pg_logical_slot_peek_changes('s1', NULL, NULL);
 count
-------
    15
(1 row)

postgres=# select * from pg_stat_replication_slots;
 slot_name | spill_txns | spill_count | spill_bytes | stream_txns | stream_count | stream_bytes | total_txns | total_bytes |           stats_reset
-----------+------------+-------------+-------------+-------------+--------------+--------------+------------+-------------+----------------------------------
 s1        |          1 |         202 |        1320 |           0 |            0 |            0 |          2 |         672 | 2021-04-29 14:35:21.836613+05:30
(1 row)

You can notice that after we have allowed spilling the 'total_bytes'
stats is showing a different value. The attached patch fixes the issue
for me.
Let me know what you think about this.

--
With Regards,
Amit Kapila.

[Attachment: use_total_size_v3.patch]
Re: Replication slot stats misgivings
On Thu, Apr 29, 2021 at 11:14 AM Masahiko Sawada wrote: > > On Thu, Apr 29, 2021 at 11:55 AM Amit Kapila wrote: > > > > On Thu, Apr 29, 2021 at 4:58 AM Masahiko Sawada > > wrote: > > > > > > On Thu, Apr 29, 2021 at 5:41 AM Tom Lane wrote: > > > > > > > > It seems that the test case added by f5fc2f5b2 is still a bit > > > > unstable, even after c64dcc7fe: > > > > > > Hmm, I don't see the exact cause yet but there are two possibilities: > > > some transactions were really spilled, > > > > > > > This is the first test and inserts just one small record, so how it > > can lead to spill of data. Do you mean to say that may be some > > background process has written some transaction which leads to a spill > > of data? > > Not sure but I thought that the logical decoding started to decodes > from a relatively old point for some reason and decoded incomplete > transactions that weren’t shown in the result. > > > > > > and it showed the old stats due > > > to losing the drop (and create) slot messages. > > > > > > > Yeah, something like this could happen. Another possibility here could > > be that before the stats collector has processed drop and create > > messages, we have enquired about the stats which lead to it giving us > > the old stats. Note, that we don't wait for 'drop' or 'create' message > > to be delivered. So, there is a possibility of the same. What do you > > think? > > Yeah, that could happen even if any message didn't get dropped. > > > > > > For the former case, it > > > seems to better to create the slot just before the insertion and > > > setting logical_decoding_work_mem to the default (64MB). For the > > > latter case, maybe we can use a different name slot than the name used > > > in other tests? > > > > > > > How about doing both of the above suggestions? Alternatively, we can > > wait for both 'drop' and 'create' message to be delivered but that > > might be overkill. > > Agreed. Attached the patch doing both things. 
Having a different slot name should solve the problem. The patch looks good to me. Regards, Vignesh
Re: Replication slot stats misgivings
On Thu, Apr 29, 2021 at 11:14 AM Masahiko Sawada wrote: > > > > > How about doing both of the above suggestions? Alternatively, we can > > wait for both 'drop' and 'create' message to be delivered but that > > might be overkill. > > Agreed. Attached the patch doing both things. > Thanks, the patch LGTM. I'll wait for a day before committing to see if anyone has better ideas. -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Thu, Apr 29, 2021 at 11:55 AM Amit Kapila wrote: > > On Thu, Apr 29, 2021 at 4:58 AM Masahiko Sawada wrote: > > > > On Thu, Apr 29, 2021 at 5:41 AM Tom Lane wrote: > > > > > > It seems that the test case added by f5fc2f5b2 is still a bit > > > unstable, even after c64dcc7fe: > > > > Hmm, I don't see the exact cause yet but there are two possibilities: > > some transactions were really spilled, > > > > This is the first test and inserts just one small record, so how it > can lead to spill of data. Do you mean to say that may be some > background process has written some transaction which leads to a spill > of data? Not sure but I thought that the logical decoding started to decodes from a relatively old point for some reason and decoded incomplete transactions that weren’t shown in the result. > > > and it showed the old stats due > > to losing the drop (and create) slot messages. > > > > Yeah, something like this could happen. Another possibility here could > be that before the stats collector has processed drop and create > messages, we have enquired about the stats which lead to it giving us > the old stats. Note, that we don't wait for 'drop' or 'create' message > to be delivered. So, there is a possibility of the same. What do you > think? Yeah, that could happen even if any message didn't get dropped. > > > For the former case, it > > seems to better to create the slot just before the insertion and > > setting logical_decoding_work_mem to the default (64MB). For the > > latter case, maybe we can use a different name slot than the name used > > in other tests? > > > > How about doing both of the above suggestions? Alternatively, we can > wait for both 'drop' and 'create' message to be delivered but that > might be overkill. Agreed. Attached the patch doing both things. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ fix_stats_test.patch Description: Binary data
Re: Replication slot stats misgivings
On Wed, Apr 28, 2021 at 7:43 PM Masahiko Sawada wrote: > > On Wed, Apr 28, 2021 at 3:25 PM Amit Kapila wrote: > > > > > I am not sure if any of these alternatives are a good idea. What do > > you think? Do you have any other ideas for this? > > I've been considering some ideas but don't come up with a good one > yet. It’s just an idea and not tested but how about having > CreateDecodingContext() register before_shmem_exit() callback with the > decoding context to ensure that we send slot stats even on > interruption. And FreeDecodingContext() cancels the callback. > Is it a good idea to send stats while exiting and rely on the same? I think before_shmem_exit is mostly used for the cleanup purpose so not sure if we can rely on it for this purpose. I think we can't be sure that in all cases we will send all stats, so maybe Vignesh's patch is sufficient to cover the cases where we avoid losing it in cases where we would have sent a large amount of data. -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Thu, Apr 29, 2021 at 8:50 AM Tom Lane wrote: > > Amit Kapila writes: > > This is the first test and inserts just one small record, so how it > > can lead to spill of data. Do you mean to say that may be some > > background process has written some transaction which leads to a spill > > of data? > > autovacuum, say? > > > Yeah, something like this could happen. Another possibility here could > > be that before the stats collector has processed drop and create > > messages, we have enquired about the stats which lead to it giving us > > the old stats. Note, that we don't wait for 'drop' or 'create' message > > to be delivered. So, there is a possibility of the same. What do you > > think? > > You should take a close look at the stats test in the main regression > tests. We had to jump through *high* hoops to get that to be stable, > and yet it still fails semi-regularly. This looks like pretty much the > same thing, and so I'm pessimistically inclined to guess that it will > never be entirely stable. > True, it is possible that we can't make it entirely stable but I would like to try some more before giving up on this. Otherwise, I guess the other possibility is to remove some of the latest tests added or probably change them to be more forgiving. For example, we can change the currently failing test to not check 'spill*' count and rely on just 'total*' count which will work even in scenarios we discussed for this failure but it will reduce the efficiency/completeness of the test case. > (At least not before the fabled stats collector rewrite, which may well > introduce some entirely new set of failure modes.) > > Do we really need this test in this form? Perhaps it could be converted > to a TAP test that's a bit more forgiving. > We have a TAP test for slot stats but there we are checking some scenarios across the restart. We can surely move these tests also there but it is not apparent to me how it can create a difference? -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On 2021-04-28 23:20:00 -0400, Tom Lane wrote: > (At least not before the fabled stats collector rewrite, which may well > introduce some entirely new set of failure modes.) FWIW, I added a function that forces a flush there. That can be done synchronously and the underlying functionality needs to exist anyway to deal with backend exit. Makes it a *lot* easier to write tests for stats related things...
Re: Replication slot stats misgivings
Amit Kapila writes: > This is the first test and inserts just one small record, so how it > can lead to spill of data. Do you mean to say that may be some > background process has written some transaction which leads to a spill > of data? autovacuum, say? > Yeah, something like this could happen. Another possibility here could > be that before the stats collector has processed drop and create > messages, we have enquired about the stats which lead to it giving us > the old stats. Note, that we don't wait for 'drop' or 'create' message > to be delivered. So, there is a possibility of the same. What do you > think? You should take a close look at the stats test in the main regression tests. We had to jump through *high* hoops to get that to be stable, and yet it still fails semi-regularly. This looks like pretty much the same thing, and so I'm pessimistically inclined to guess that it will never be entirely stable. (At least not before the fabled stats collector rewrite, which may well introduce some entirely new set of failure modes.) Do we really need this test in this form? Perhaps it could be converted to a TAP test that's a bit more forgiving. regards, tom lane
Re: Replication slot stats misgivings
On Thu, Apr 29, 2021 at 4:58 AM Masahiko Sawada wrote: > > On Thu, Apr 29, 2021 at 5:41 AM Tom Lane wrote: > > > > It seems that the test case added by f5fc2f5b2 is still a bit > > unstable, even after c64dcc7fe: > > Hmm, I don't see the exact cause yet but there are two possibilities: > some transactions were really spilled, > This is the first test and inserts just one small record, so how it can lead to spill of data. Do you mean to say that may be some background process has written some transaction which leads to a spill of data? > and it showed the old stats due > to losing the drop (and create) slot messages. > Yeah, something like this could happen. Another possibility here could be that before the stats collector has processed drop and create messages, we have enquired about the stats which lead to it giving us the old stats. Note, that we don't wait for 'drop' or 'create' message to be delivered. So, there is a possibility of the same. What do you think? > For the former case, it > seems to better to create the slot just before the insertion and > setting logical_decoding_work_mem to the default (64MB). For the > latter case, maybe we can use a different name slot than the name used > in other tests? > How about doing both of the above suggestions? Alternatively, we can wait for both 'drop' and 'create' message to be delivered but that might be overkill. -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Thu, Apr 29, 2021 at 5:41 AM Tom Lane wrote:
>
> It seems that the test case added by f5fc2f5b2 is still a bit
> unstable, even after c64dcc7fe:

Hmm, I don't see the exact cause yet but there are two possibilities:
either some transactions were really spilled, or it showed the old
stats due to losing the drop (and create) slot messages.

For the former case, it seems better to create the slot just before
the insertion and set logical_decoding_work_mem to the default (64MB).
For the latter case, maybe we can use a different slot name than the
one used in other tests?

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Re: Replication slot stats misgivings
It seems that the test case added by f5fc2f5b2 is still a bit
unstable, even after c64dcc7fe:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=peripatus&dt=2021-04-23%2006%3A20%3A12
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=peripatus&dt=2021-04-24%2018%3A20%3A10
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=snapper&dt=2021-04-28%2017%3A53%3A14

(The snapper run fails to show regression.diffs, so it's not certain
that it's the same failure as peripatus, but ...)

regards, tom lane
Re: Replication slot stats misgivings
On Wed, Apr 28, 2021 at 3:25 PM Amit Kapila wrote: > > On Wed, Apr 28, 2021 at 9:37 AM Masahiko Sawada wrote: > > > > On Wed, Apr 28, 2021 at 12:29 PM Amit Kapila > > wrote: > > > > > > On Tue, Apr 27, 2021 at 11:02 AM vignesh C wrote: > > > > > > > > On Tue, Apr 27, 2021 at 9:48 AM vignesh C wrote: > > > > > > > > > > > > > Attached patch has the changes to update statistics during > > > > spill/stream which prevents the statistics from being lost during > > > > interrupt. > > > > > > > > > > void > > > -UpdateDecodingStats(LogicalDecodingContext *ctx) > > > +UpdateDecodingStats(ReorderBuffer *rb) > > > > > > I don't think you need to change this interface because > > > reorderbuffer->private_data points to LogicalDecodingContext. See > > > StartupDecodingContext. > > > > +1 > > > > With this approach, we could still miss the totalTxns and totalBytes > > updates if the decoding a large but less than > > logical_decoding_work_mem is interrupted, right? > > > > Right, but is there some simple way to avoid that? I see two > possibilities (a) store stats in ReplicationSlot and then send them at > ReplicationSlotRelease but that will lead to an increase in shared > memory usage and as per the discussion above, we don't want that, (b) > send intermediate stats after decoding say N changes but for that, we > need to additionally compute the size of each change which might add > some overhead. Right. > I am not sure if any of these alternatives are a good idea. What do > you think? Do you have any other ideas for this? I've been considering some ideas but don't come up with a good one yet. It’s just an idea and not tested but how about having CreateDecodingContext() register before_shmem_exit() callback with the decoding context to ensure that we send slot stats even on interruption. And FreeDecodingContext() cancels the callback. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
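The exit-callback idea above can be sketched with a toy model. Everything below is illustrative rather than the real backend code: the registry stands in for before_shmem_exit()/cancel registration in PostgreSQL's ipc.c, and flush_slot_stats() stands in for sending the accumulated slot stats, so all names and signatures here are assumptions:

```c
#include <assert.h>
#include <stddef.h>

/* Toy registry standing in for the before_shmem_exit() machinery;
 * the real API lives in src/backend/storage/ipc/ipc.c. */
typedef void (*exit_callback) (void *arg);

#define MAX_EXIT_CBS 8
struct { exit_callback fn; void *arg; } exit_cbs[MAX_EXIT_CBS];
int n_exit_cbs = 0;

void
register_exit_cb(exit_callback fn, void *arg)
{
	exit_cbs[n_exit_cbs].fn = fn;
	exit_cbs[n_exit_cbs].arg = arg;
	n_exit_cbs++;
}

void
cancel_exit_cb(exit_callback fn, void *arg)
{
	for (int i = 0; i < n_exit_cbs; i++)
		if (exit_cbs[i].fn == fn && exit_cbs[i].arg == arg)
		{
			exit_cbs[i] = exit_cbs[--n_exit_cbs];
			return;
		}
}

/* What process exit (normal or after an interruption) would do. */
void
run_exit_cbs(void)
{
	while (n_exit_cbs > 0)
	{
		n_exit_cbs--;
		exit_cbs[n_exit_cbs].fn(exit_cbs[n_exit_cbs].arg);
	}
}

/* Hypothetical per-slot stats; flush_slot_stats() plays the role of
 * sending them to the stats collector. */
typedef struct { long pending; long sent; } SlotStats;

void
flush_slot_stats(void *arg)
{
	SlotStats  *s = arg;

	s->sent += s->pending;
	s->pending = 0;
}
```

In this model, CreateDecodingContext() would call register_exit_cb(flush_slot_stats, stats) and FreeDecodingContext() would flush normally and then cancel the callback, so an interrupted decode still reports its counters while the normal path does not report them twice.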
Re: Replication slot stats misgivings
On Wed, Apr 28, 2021 at 4:51 PM Masahiko Sawada wrote:
>
> On Wed, Apr 28, 2021 at 6:39 PM Amit Kapila wrote:
> >
> > I think we can fix it by keeping track of total_size in toptxn as we
> > are doing for the streaming case in ReorderBufferChangeMemoryUpdate.
> > We can probably do it for non-streaming cases as well.
>
> Agreed.
>
> I've updated the patch. What do you think?
>

@@ -1369,7 +1369,7 @@ ReorderBufferIterTXNNext(ReorderBuffer *rb, ReorderBufferIterTXNState *state)
  * Update the total bytes processed before releasing the current set
  * of changes and restoring the new set of changes.
  */
- rb->totalBytes += rb->size;
+ rb->totalBytes += entry->txn->total_size;
  if (ReorderBufferRestoreChanges(rb, entry->txn, &entry->file,
                                  &state->entries[off].segno))

I have not tested this, but in the above change, won't you need to
check txn->toptxn for subtxns?

--
With Regards,
Amit Kapila.
Re: Replication slot stats misgivings
On Wed, Apr 28, 2021 at 6:39 PM Amit Kapila wrote:
>
> On Wed, Apr 28, 2021 at 12:49 PM Masahiko Sawada wrote:
> >
> > BTW regarding the commit f5fc2f5b23 that added total_txns and
> > total_bytes, we add the reorder buffer size (i.g., rb->size) to
> > rb->totalBytes but I think we should use the transaction size (i.g.,
> > txn->size) instead:
>
> You are right about the problem but I think your proposed fix also
> won't work because txn->size always has current transaction size which
> will be top-transaction in the case when a transaction has multiple
> subtransactions. It won't include the subtxn->size.

Right. I missed the point that ReorderBufferProcessTXN() also
processes subtransactions.

> I think we can fix it by keeping track of total_size in toptxn as we
> are doing for the streaming case in ReorderBufferChangeMemoryUpdate.
> We can probably do it for non-streaming cases as well.

Agreed. I've updated the patch. What do you think?

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

[Attachment: use_total_size_v2.patch]
Re: Replication slot stats misgivings
On Wed, Apr 28, 2021 at 12:49 PM Masahiko Sawada wrote: > > > BTW regarding the commit f5fc2f5b23 that added total_txns and > total_bytes, we add the reorder buffer size (i.g., rb->size) to > rb->totalBytes but I think we should use the transaction size (i.g., > txn->size) instead: > You are right about the problem but I think your proposed fix also won't work because txn->size always has current transaction size which will be top-transaction in the case when a transaction has multiple subtransactions. It won't include the subtxn->size. For example, you can try to decode with below kind of transaction: Begin; insert into t1 values(1); savepoint s1; insert into t1 values(2); savepoint s2; insert into t1 values(3); commit; I think we can fix it by keeping track of total_size in toptxn as we are doing for the streaming case in ReorderBufferChangeMemoryUpdate. We can probably do it for non-streaming cases as well. -- With Regards, Amit Kapila.
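The per-transaction accounting described above can be modeled in a self-contained sketch. This is not the real reorderbuffer.c code; the structs and helpers below are simplified stand-ins that only mirror the idea: each change grows the owning (sub)transaction's size and the top-level transaction's cumulative total_size, and totalBytes is charged from the transaction being processed instead of from rb->size, which can also count unrelated in-progress transactions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the reorder-buffer structs (illustrative only). */
typedef struct Txn
{
	struct Txn *toptxn;			/* NULL for a top-level transaction */
	int64_t		size;			/* bytes of changes owned by this (sub)txn */
	int64_t		total_size;		/* top-level txn: bytes including all subtxns */
} Txn;

typedef struct Buf
{
	int64_t		size;			/* bytes of *all* transactions in the buffer */
	int64_t		totalBytes;		/* the stats counter shown as total_bytes */
} Buf;

/* Analogue of ReorderBufferChangeMemoryUpdate: charge the (sub)txn, the
 * top-level transaction's cumulative total_size, and the whole buffer. */
void
add_change(Buf *rb, Txn *txn, int64_t sz)
{
	Txn		   *top = txn->toptxn ? txn->toptxn : txn;

	txn->size += sz;
	top->total_size += sz;
	rb->size += sz;
}

/* The fix under discussion: when a transaction is processed, charge
 * totalBytes with that transaction's cumulative size (via its top-level
 * transaction), not with rb->size. */
void
account_processed_txn(Buf *rb, Txn *txn)
{
	Txn		   *top = txn->toptxn ? txn->toptxn : txn;

	rb->totalBytes += top->total_size;
}
```

With the buggy form (rb->totalBytes += rb->size), processing a 150-byte transaction while a 1000-byte transaction is still being assembled would charge 1150 bytes — the kind of inflated total_bytes seen in the repro earlier in the thread.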
Re: Replication slot stats misgivings
On Fri, Apr 16, 2021 at 2:58 PM Amit Kapila wrote: > > On Thu, Apr 15, 2021 at 4:35 PM Masahiko Sawada wrote: > > > > Thank you for the update! The patch looks good to me. > > BTW regarding the commit f5fc2f5b23 that added total_txns and total_bytes, we add the reorder buffer size (i.g., rb->size) to rb->totalBytes but I think we should use the transaction size (i.g., txn->size) instead: @@ -1363,6 +1365,11 @@ ReorderBufferIterTXNNext(ReorderBuffer *rb, ReorderBufferIterTXNState *state) dlist_delete(>node); dlist_push_tail(>old_change, >node); + /* +* Update the total bytes processed before releasing the current set +* of changes and restoring the new set of changes. +*/ + rb->totalBytes += rb->size; if (ReorderBufferRestoreChanges(rb, entry->txn, >file, >entries[off].segno)) { @@ -2363,6 +2370,20 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, ReorderBufferIterTXNFinish(rb, iterstate); iterstate = NULL; + /* +* Update total transaction count and total transaction bytes +* processed. Ensure to not count the streamed transaction multiple +* times. +* +* Note that the statistics computation has to be done after +* ReorderBufferIterTXNFinish as it releases the serialized change +* which we have already accounted in ReorderBufferIterTXNNext. +*/ + if (!rbtxn_is_streamed(txn)) + rb->totalTxns++; + + rb->totalBytes += rb->size; + IIUC rb->size could include multiple decoded transactions. So it's not appropriate to add that value to the counter as the transaction size passed to the logical decoding plugin. If the reorder buffer process a transaction while having a large transaction that is being decoded, we could end up more increasing txn_bytes than necessary. Please review the attached patch. 
Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c index c27f710053..30d9ffa0da 100644 --- a/src/backend/replication/logical/reorderbuffer.c +++ b/src/backend/replication/logical/reorderbuffer.c @@ -1369,7 +1369,7 @@ ReorderBufferIterTXNNext(ReorderBuffer *rb, ReorderBufferIterTXNState *state) * Update the total bytes processed before releasing the current set * of changes and restoring the new set of changes. */ - rb->totalBytes += rb->size; + rb->totalBytes += entry->txn->size; if (ReorderBufferRestoreChanges(rb, entry->txn, >file, >entries[off].segno)) { @@ -2382,7 +2382,7 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, if (!rbtxn_is_streamed(txn)) rb->totalTxns++; - rb->totalBytes += rb->size; + rb->totalBytes += txn->size; /* * Done with current changes, send the last message for this set of
Re: Replication slot stats misgivings
On Wed, Apr 28, 2021 at 9:37 AM Masahiko Sawada wrote: > > On Wed, Apr 28, 2021 at 12:29 PM Amit Kapila wrote: > > > > On Tue, Apr 27, 2021 at 11:02 AM vignesh C wrote: > > > > > > On Tue, Apr 27, 2021 at 9:48 AM vignesh C wrote: > > > > > > > > > > Attached patch has the changes to update statistics during > > > spill/stream which prevents the statistics from being lost during > > > interrupt. > > > > > > > void > > -UpdateDecodingStats(LogicalDecodingContext *ctx) > > +UpdateDecodingStats(ReorderBuffer *rb) > > > > I don't think you need to change this interface because > > reorderbuffer->private_data points to LogicalDecodingContext. See > > StartupDecodingContext. > > +1 > > With this approach, we could still miss the totalTxns and totalBytes > updates if the decoding a large but less than > logical_decoding_work_mem is interrupted, right? > Right, but is there some simple way to avoid that? I see two possibilities (a) store stats in ReplicationSlot and then send them at ReplicationSlotRelease but that will lead to an increase in shared memory usage and as per the discussion above, we don't want that, (b) send intermediate stats after decoding say N changes but for that, we need to additionally compute the size of each change which might add some overhead. I am not sure if any of these alternatives are a good idea. What do you think? Do you have any other ideas for this? -- With Regards, Amit Kapila.
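Alternative (b) above — sending intermediate stats after every N changes — might look like the following sketch. The threshold, struct, and function names are all made up for illustration; the point is that an interrupted decode then loses at most one batch of counters, at the cost of sizing every change:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical threshold: how many changes to decode before pushing
 * intermediate stats to the collector. */
#define FLUSH_EVERY_N_CHANGES 1000

typedef struct
{
	int64_t		pending_bytes;		/* accumulated since the last report */
	int			pending_changes;
	int64_t		reported_bytes;		/* what the stats collector has seen */
} DecodeStats;

/* Called once per decoded change; computing change_size for every change
 * is the extra overhead alternative (b) would pay. */
void
record_change(DecodeStats *st, int64_t change_size)
{
	st->pending_bytes += change_size;
	if (++st->pending_changes >= FLUSH_EVERY_N_CHANGES)
	{
		st->reported_bytes += st->pending_bytes;	/* UpdateDecodingStats() */
		st->pending_bytes = 0;
		st->pending_changes = 0;
	}
}
```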
Re: Replication slot stats misgivings
On Wed, Apr 28, 2021 at 9:37 AM Masahiko Sawada wrote: > > On Wed, Apr 28, 2021 at 12:29 PM Amit Kapila wrote: > > > > On Tue, Apr 27, 2021 at 11:02 AM vignesh C wrote: > > > > > > On Tue, Apr 27, 2021 at 9:48 AM vignesh C wrote: > > > > > > > > > > Attached patch has the changes to update statistics during > > > spill/stream which prevents the statistics from being lost during > > > interrupt. > > > > > > > void > > -UpdateDecodingStats(LogicalDecodingContext *ctx) > > +UpdateDecodingStats(ReorderBuffer *rb) > > > > I don't think you need to change this interface because > > reorderbuffer->private_data points to LogicalDecodingContext. See > > StartupDecodingContext. > > +1 > > With this approach, we could still miss the totalTxns and totalBytes > updates if the decoding a large but less than > logical_decoding_work_mem is interrupted, right? Yes you are right, I felt that is reasonable and that way it reduces frequent calls to the stats collector to update the stats. Regards, Vignesh
Re: Replication slot stats misgivings
On Wed, Apr 28, 2021 at 12:29 PM Amit Kapila wrote: > > On Tue, Apr 27, 2021 at 11:02 AM vignesh C wrote: > > > > On Tue, Apr 27, 2021 at 9:48 AM vignesh C wrote: > > > > > > > Attached patch has the changes to update statistics during > > spill/stream which prevents the statistics from being lost during > > interrupt. > > > > void > -UpdateDecodingStats(LogicalDecodingContext *ctx) > +UpdateDecodingStats(ReorderBuffer *rb) > > I don't think you need to change this interface because > reorderbuffer->private_data points to LogicalDecodingContext. See > StartupDecodingContext. +1 With this approach, we could still miss the totalTxns and totalBytes updates if the decoding of a transaction that is large but still below logical_decoding_work_mem is interrupted, right? Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Replication slot stats misgivings
On Wed, Apr 28, 2021 at 8:59 AM Amit Kapila wrote: > > On Tue, Apr 27, 2021 at 11:02 AM vignesh C wrote: > > > > On Tue, Apr 27, 2021 at 9:48 AM vignesh C wrote: > > > > > > > Attached patch has the changes to update statistics during > > spill/stream which prevents the statistics from being lost during > > interrupt. > > > > void > -UpdateDecodingStats(LogicalDecodingContext *ctx) > +UpdateDecodingStats(ReorderBuffer *rb) > > I don't think you need to change this interface because > reorderbuffer->private_data points to LogicalDecodingContext. See > StartupDecodingContext. Other than that there is a comment in the code > "Update the decoding stats at transaction prepare/commit/abort...". > This patch should extend that comment by saying something like > "Additionally we send the stats when we spill or stream the changes to > avoid losing them in case the decoding is interrupted." Thanks for the comments, Please find the attached v4 patch having the fixes for the same. Regards, Vignesh From 533c45c4cbd4545350d464bbf7a2df91ee668a75 Mon Sep 17 00:00:00 2001 From: vignesh Date: Tue, 27 Apr 2021 10:56:02 +0530 Subject: [PATCH v4] Update replication statistics after every stream/spill. Currently, replication slot statistics are updated at prepare, commit, and rollback. Now, if the transaction is interrupted the stats might not get updated. Fixed this by updating replication statistics after every stream/spill. --- src/backend/replication/logical/decode.c| 14 -- src/backend/replication/logical/reorderbuffer.c | 6 ++ 2 files changed, 14 insertions(+), 6 deletions(-) diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c index 7924581cdc..888e064ec0 100644 --- a/src/backend/replication/logical/decode.c +++ b/src/backend/replication/logical/decode.c @@ -746,9 +746,10 @@ DecodeCommit(LogicalDecodingContext *ctx, XLogRecordBuffer *buf, } /* - * Update the decoding stats at transaction prepare/commit/abort. 
It is - * not clear that sending more or less frequently than this would be - * better. + * Update the decoding stats at transaction prepare/commit/abort. + * Additionally we send the stats when we spill or stream the changes to + * avoid losing them in case the decoding is interrupted. It is not clear + * that sending more or less frequently than this would be better. */ UpdateDecodingStats(ctx); } @@ -828,9 +829,10 @@ DecodePrepare(LogicalDecodingContext *ctx, XLogRecordBuffer *buf, ReorderBufferPrepare(ctx->reorder, xid, parsed->twophase_gid); /* - * Update the decoding stats at transaction prepare/commit/abort. + * Additionally we send the stats when we spill or stream the changes to + * avoid losing them in case the decoding is interrupted. It is not clear + * that sending more or less frequently than this would be better. */ UpdateDecodingStats(ctx); } diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c index c27f710053..ceb83bcbf9 100644 --- a/src/backend/replication/logical/reorderbuffer.c +++ b/src/backend/replication/logical/reorderbuffer.c @@ -3551,6 +3551,9 @@ ReorderBufferSerializeTXN(ReorderBuffer *rb, ReorderBufferTXN *txn) /* don't consider already serialized transactions */ rb->spillTxns += (rbtxn_is_serialized(txn) || rbtxn_is_serialized_clear(txn)) ? 0 : 1; + + /* update the decoding stats */ + UpdateDecodingStats(rb->private_data); } Assert(spilled == txn->nentries_mem); @@ -3920,6 +3923,9 @@ ReorderBufferStreamTXN(ReorderBuffer *rb, ReorderBufferTXN *txn) /* Don't consider already streamed transaction. */ rb->streamTxns += (txn_is_streamed) ? 0 : 1; + /* update the decoding stats */ + UpdateDecodingStats(rb->private_data); + Assert(dlist_is_empty(&txn->changes)); Assert(txn->nentries == 0); Assert(txn->nentries_mem == 0); -- 2.25.1
Re: Replication slot stats misgivings
On Tue, Apr 27, 2021 at 11:02 AM vignesh C wrote: > > On Tue, Apr 27, 2021 at 9:48 AM vignesh C wrote: > > > > Attached patch has the changes to update statistics during > spill/stream which prevents the statistics from being lost during > interrupt. > void -UpdateDecodingStats(LogicalDecodingContext *ctx) +UpdateDecodingStats(ReorderBuffer *rb) I don't think you need to change this interface because reorderbuffer->private_data points to LogicalDecodingContext. See StartupDecodingContext. Other than that there is a comment in the code "Update the decoding stats at transaction prepare/commit/abort...". This patch should extend that comment by saying something like "Additionally we send the stats when we spill or stream the changes to avoid losing them in case the decoding is interrupted." -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Wed, Apr 28, 2021 at 8:28 AM Amit Kapila wrote: > > On Tue, Apr 27, 2021 at 8:28 PM Masahiko Sawada wrote: > > > > On Tue, Apr 27, 2021 at 11:29 PM Amit Kapila > > wrote: > > > > > > On Tue, Apr 27, 2021 at 5:40 PM Amit Kapila > > > wrote: > > > > > > > > > > I am not sure if the timeout happened because the machine is slow or > > > is it in any way related to code. I am seeing some previous failures > > > due to timeout on this machine [1][2]. In those failures, I see the > > > "using stale stats" message. > > > > It seems like a time-dependent issue but I'm wondering why the logical > > decoding test failed at this time. > > > > As per the analysis done till now, it appears to be due to the reason > that the machine is slow which leads to timeout and there appear to be > some prior failures related to timeout as well. I think it is better > to wait for another run (or few runs) to see if this occurs again. > Yes, checkpoint seems to take a lot of time, could be because the machine is slow. Let's wait for the next run and see. Regards, Vignesh
Re: Replication slot stats misgivings
On Tue, Apr 27, 2021 at 8:28 PM Masahiko Sawada wrote: > > On Tue, Apr 27, 2021 at 11:29 PM Amit Kapila wrote: > > > > On Tue, Apr 27, 2021 at 5:40 PM Amit Kapila wrote: > > > > > > > I am not sure if the timeout happened because the machine is slow or > > is it in any way related to code. I am seeing some previous failures > > due to timeout on this machine [1][2]. In those failures, I see the > > "using stale stats" message. > > It seems like a time-dependent issue but I'm wondering why the logical > decoding test failed at this time. > As per the analysis done till now, it appears to be due to the reason that the machine is slow which leads to timeout and there appear to be some prior failures related to timeout as well. I think it is better to wait for another run (or few runs) to see if this occurs again. -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Tue, Apr 27, 2021 at 11:29 PM Amit Kapila wrote: > > On Tue, Apr 27, 2021 at 5:40 PM Amit Kapila wrote: > > > > On Tue, Apr 27, 2021 at 8:58 AM Masahiko Sawada > > wrote: > > > > I have pushed this patch and seeing one buildfarm failure: > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill&dt=2021-04-27%2009%3A23%3A14 > > > > starting permutation: s1_init s1_begin s1_insert_tbl1 s1_insert_tbl2 > > s2_alter_tbl1_char s1_commit s2_get_changes > > + isolationtester: canceling step s1_init after 314 seconds > > step s1_init: SELECT 'init' FROM > > pg_create_logical_replication_slot('isolation_slot', 'test_decoding'); > > ?column? > > > > I am analyzing this. > > > > After checking below logs corresponding to this test, it seems test > has been executed and create_slot was successful: The pg_create_logical_replication_slot() was executed at 11:04:25: 2021-04-27 11:04:25.494 UTC [17694956:49] isolation/concurrent_ddl_dml LOG: statement: SELECT 'init' FROM pg_create_logical_replication_slot('isolation_slot', 'test_decoding'); Therefore this command took 314 sec, which matches the number the isolation test reported. And the following logs come after that: 2021-04-27 11:06:43.770 UTC [17694956:50] isolation/concurrent_ddl_dml LOG: logical decoding found consistent point at 0/17F9078 2021-04-27 11:06:43.770 UTC [17694956:51] isolation/concurrent_ddl_dml DETAIL: There are no running transactions. 
> 2021-04-27 11:06:43.770 UTC [17694956:52] isolation/concurrent_ddl_dml > STATEMENT: SELECT 'init' FROM > pg_create_logical_replication_slot('isolation_slot', 'test_decoding'); > 2021-04-27 11:07:11.748 UTC [5243096:9] LOG: checkpoint starting: time > 2021-04-27 11:09:24.332 UTC [5243096:10] LOG: checkpoint complete: > wrote 14 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; > write=0.716 s, sync=0.001 s, total=132.584 s; sync files=0, > longest=0.000 s, average=0.000 s; distance=198 kB, estimate=406 kB > 2021-04-27 11:09:40.116 UTC [6226046:1] [unknown] LOG: connection > received: host=[local] > 2021-04-27 11:09:40.117 UTC [17694956:53] isolation/concurrent_ddl_dml > LOG: statement: BEGIN; > 2021-04-27 11:09:40.117 UTC [17694956:54] isolation/concurrent_ddl_dml > LOG: statement: INSERT INTO tbl1 (val1, val2) VALUES (1, 1); > 2021-04-27 11:09:40.118 UTC [17694956:55] isolation/concurrent_ddl_dml > LOG: statement: INSERT INTO tbl2 (val1, val2) VALUES (1, 1); > 2021-04-27 11:09:40.119 UTC [10944636:49] isolation/concurrent_ddl_dml > LOG: statement: ALTER TABLE tbl1 ALTER COLUMN val2 TYPE character > varying; > > I am not sure but there is some possibility that even though create > slot is successful, the isolation tester got successful in canceling > it, maybe because create_slot is just finished at the same time. Yeah, we see the test log "canceling step s1_init after 314 seconds" but don't see any log indicating canceling query. > As we > can see from logs, during this test checkpoint also happened which > could also lead to the slowness of this particular command. Yes. I also think the checkpoint could somewhat lead to the slowness. And since create_slot() took 2min to find a consistent snapshot the system might have already been busy. 
> > Also, I see a lot of messages like below which indicate stats > collector is also quite slow: > 2021-04-27 10:57:59.385 UTC [18743536:1] LOG: using stale statistics > instead of current ones because stats collector is not responding > > I am not sure if the timeout happened because the machine is slow or > is it in any way related to code. I am seeing some previous failures > due to timeout on this machine [1][2]. In those failures, I see the > "using stale stats" message. It seems like a time-dependent issue but I'm wondering why the logical decoding test failed at this time. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Replication slot stats misgivings
On Tue, Apr 27, 2021 at 5:40 PM Amit Kapila wrote: > > On Tue, Apr 27, 2021 at 8:58 AM Masahiko Sawada wrote: > > I have pushed this patch and seeing one buildfarm failure: > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill&dt=2021-04-27%2009%3A23%3A14 > > starting permutation: s1_init s1_begin s1_insert_tbl1 s1_insert_tbl2 > s2_alter_tbl1_char s1_commit s2_get_changes > + isolationtester: canceling step s1_init after 314 seconds > step s1_init: SELECT 'init' FROM > pg_create_logical_replication_slot('isolation_slot', 'test_decoding'); > ?column? > > I am analyzing this. > After checking below logs corresponding to this test, it seems test has been executed and create_slot was successful: 2021-04-27 11:06:43.770 UTC [17694956:52] isolation/concurrent_ddl_dml STATEMENT: SELECT 'init' FROM pg_create_logical_replication_slot('isolation_slot', 'test_decoding'); 2021-04-27 11:07:11.748 UTC [5243096:9] LOG: checkpoint starting: time 2021-04-27 11:09:24.332 UTC [5243096:10] LOG: checkpoint complete: wrote 14 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.716 s, sync=0.001 s, total=132.584 s; sync files=0, longest=0.000 s, average=0.000 s; distance=198 kB, estimate=406 kB 2021-04-27 11:09:40.116 UTC [6226046:1] [unknown] LOG: connection received: host=[local] 2021-04-27 11:09:40.117 UTC [17694956:53] isolation/concurrent_ddl_dml LOG: statement: BEGIN; 2021-04-27 11:09:40.117 UTC [17694956:54] isolation/concurrent_ddl_dml LOG: statement: INSERT INTO tbl1 (val1, val2) VALUES (1, 1); 2021-04-27 11:09:40.118 UTC [17694956:55] isolation/concurrent_ddl_dml LOG: statement: INSERT INTO tbl2 (val1, val2) VALUES (1, 1); 2021-04-27 11:09:40.119 UTC [10944636:49] isolation/concurrent_ddl_dml LOG: statement: ALTER TABLE tbl1 ALTER COLUMN val2 TYPE character varying; I am not sure but there is some possibility that even though create slot is successful, the isolation tester got successful in canceling it, maybe because create_slot is just finished at the same time. As we can see from logs, during this test checkpoint also happened which could also lead to the slowness of this particular command. Also, I see a lot of messages like below which indicate stats collector is also quite slow: 2021-04-27 10:57:59.385 UTC [18743536:1] LOG: using stale statistics instead of current ones because stats collector is not responding I am not sure if the timeout happened because the machine is slow or is it in any way related to code. I am seeing some previous failures due to timeout on this machine [1][2]. In those failures, I see the "using stale stats" message. Also, I am not able to see why it would fail due to this patch. [1] - https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill&dt=2021-02-23%2004%3A23%3A56 [2] - https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill&dt=2020-12-24%2005%3A31%3A43 -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Tue, Apr 27, 2021 at 8:58 AM Masahiko Sawada wrote: > > On Tue, Apr 27, 2021 at 11:45 AM Amit Kapila wrote: > > > > > > Sawada-San, I would like to go ahead with your > > "Use-HTAB-for-replication-slot-statistics" unless you think otherwise? > > I agree that it's better to use the stats collector. So please go ahead. > I have pushed this patch and seeing one buildfarm failure: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill&dt=2021-04-27%2009%3A23%3A14 starting permutation: s1_init s1_begin s1_insert_tbl1 s1_insert_tbl2 s2_alter_tbl1_char s1_commit s2_get_changes + isolationtester: canceling step s1_init after 314 seconds step s1_init: SELECT 'init' FROM pg_create_logical_replication_slot('isolation_slot', 'test_decoding'); ?column? I am analyzing this. Do let me know if you have any thoughts on the same? -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Tue, Apr 27, 2021 at 9:48 AM vignesh C wrote: > > On Tue, Apr 27, 2021 at 9:43 AM Amit Kapila wrote: > > > > On Tue, Apr 27, 2021 at 9:17 AM Masahiko Sawada > > wrote: > > > > > > On Tue, Apr 27, 2021 at 11:31 AM vignesh C wrote: > > > > > > > > > > And I think there is > > > > > > also a risk to increase shared memory when we want to add other > > > > > > statistics in the future. > > > > > > > > > > > > > > > > Yeah, so do you think it is not a good idea to store stats in > > > > > ReplicationSlot? Actually storing them in a slot makes it easier to > > > > > send them during ReplicationSlotRelease which is quite helpful if the > > > > > replication is interrupted due to some reason. Or the other idea was > > > > > that we send stats every time we stream or spill changes. > > > > > > > > We use around 64 bytes of shared memory to store the statistics > > > > information per slot, I'm not sure if this is a lot of memory. If this > > > > memory is fine, then I felt the approach to store stats seems fine. If > > > > that memory is too much then we could use the other approach to update > > > > stats when we stream or spill the changes as suggested by Amit. > > > > > > I agree that makes it easier to send slot stats during > > > ReplicationSlotRelease() but I'd prefer to avoid storing data that > > > doesn't need to be shared in the shared buffer if possible. > > > > > > > Sounds reasonable and we might add some stats in the future so that > > will further increase the usage of shared memory. > > > > > And those > > > counters are not used by physical slots at all. If sending slot stats > > > every time we stream or spill changes doesn't affect the system much, > > > I think it's better than having slot stats in the shared memory. > > > > > > > As the minimum size of logical_decoding_work_mem is 64KB, so in the > > worst case, we will send stats after decoding that many changes. 
I > > don't think it would impact too much considering that we need to spill > > or stream those many changes. If it concerns any users they can > > always increase logical_decoding_work_mem. The default value is 64MB > > at which point, I don't think it will matter sending the stats. > > Sounds good to me, I will rebase my previous patch and send a patch for this. > Attached patch has the changes to update statistics during spill/stream which prevents the statistics from being lost during interrupt. Thoughts? Regards, Vignesh From a1fd6f76c955482efa7e63ea6c25ae6350160b4d Mon Sep 17 00:00:00 2001 From: vignesh Date: Tue, 27 Apr 2021 10:56:02 +0530 Subject: [PATCH v3] Update replication statistics after every stream/spill. Currently, replication slot statistics are updated at prepare, commit, and rollback. Now, if the transaction is interrupted the stats might not get updated. Fixed this by updating replication statistics after every stream/spill. --- src/backend/replication/logical/decode.c| 6 +++--- src/backend/replication/logical/logical.c | 6 +++--- src/backend/replication/logical/reorderbuffer.c | 3 +++ src/include/replication/logical.h | 2 +- 4 files changed, 10 insertions(+), 7 deletions(-) diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c index 7924581cdc..2c009d51ec 100644 --- a/src/backend/replication/logical/decode.c +++ b/src/backend/replication/logical/decode.c @@ -750,7 +750,7 @@ DecodeCommit(LogicalDecodingContext *ctx, XLogRecordBuffer *buf, * not clear that sending more or less frequently than this would be * better. */ - UpdateDecodingStats(ctx); + UpdateDecodingStats(ctx->reorder); } /* @@ -832,7 +832,7 @@ DecodePrepare(LogicalDecodingContext *ctx, XLogRecordBuffer *buf, * not clear that sending more or less frequently than this would be * better. 
*/ - UpdateDecodingStats(ctx); + UpdateDecodingStats(ctx->reorder); } @@ -889,7 +889,7 @@ DecodeAbort(LogicalDecodingContext *ctx, XLogRecordBuffer *buf, } /* update the decoding stats */ - UpdateDecodingStats(ctx); + UpdateDecodingStats(ctx->reorder); } /* diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c index 00543ede45..0c3ef0f93f 100644 --- a/src/backend/replication/logical/logical.c +++ b/src/backend/replication/logical/logical.c @@ -1770,10 +1770,10 @@ ResetLogicalStreamingState(void) * Report stats for a slot. */ void -UpdateDecodingStats(LogicalDecodingContext *ctx) +UpdateDecodingStats(ReorderBuffer *rb) { - ReorderBuffer *rb = ctx->reorder; PgStat_StatReplSlotEntry repSlotStat; + ReplicationSlot *slot = MyReplicationSlot; /* Nothing to do if we don't have any replication stats to be sent. */ if (rb->spillBytes <= 0 && rb->streamBytes <= 0 && rb->totalBytes <= 0) @@ -1790,7 +1790,7 @@ UpdateDecodingStats(LogicalDecodingContext *ctx) (long long) rb->totalTxns, (long long) rb->totalBytes); - namestrcpy(&repSlotStat.slotname, NameStr(ctx->slot->data.name)); + namestrcpy(&repSlotStat.slotname,
Re: Replication slot stats misgivings
On Tue, Apr 27, 2021 at 1:18 PM vignesh C wrote: > > On Tue, Apr 27, 2021 at 9:43 AM Amit Kapila wrote: > > > > On Tue, Apr 27, 2021 at 9:17 AM Masahiko Sawada > > wrote: > > > > > > On Tue, Apr 27, 2021 at 11:31 AM vignesh C wrote: > > > > > > > > > > And I think there is > > > > > > also a risk to increase shared memory when we want to add other > > > > > > statistics in the future. > > > > > > > > > > > > > > > > Yeah, so do you think it is not a good idea to store stats in > > > > > ReplicationSlot? Actually storing them in a slot makes it easier to > > > > > send them during ReplicationSlotRelease which is quite helpful if the > > > > > replication is interrupted due to some reason. Or the other idea was > > > > > that we send stats every time we stream or spill changes. > > > > > > > > We use around 64 bytes of shared memory to store the statistics > > > > information per slot, I'm not sure if this is a lot of memory. If this > > > > memory is fine, then I felt the approach to store stats seems fine. If > > > > that memory is too much then we could use the other approach to update > > > > stats when we stream or spill the changes as suggested by Amit. > > > > > > I agree that makes it easier to send slot stats during > > > ReplicationSlotRelease() but I'd prefer to avoid storing data that > > > doesn't need to be shared in the shared buffer if possible. > > > > > > > Sounds reasonable and we might add some stats in the future so that > > will further increase the usage of shared memory. > > > > > And those > > > counters are not used by physical slots at all. If sending slot stats > > > every time we stream or spill changes doesn't affect the system much, > > > I think it's better than having slot stats in the shared memory. > > > > > > > As the minimum size of logical_decoding_work_mem is 64KB, so in the > > worst case, we will send stats after decoding that many changes. 
I > > don't think it would impact too much considering that we need to spill > > or stream those many changes. If it concerns any users they can > > always increase logical_decoding_work_mem. The default value is 64MB > > at which point, I don't think it will matter sending the stats. > > Sounds good to me, I will rebase my previous patch and send a patch for this. +1. Thanks! Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Replication slot stats misgivings
On Tue, Apr 27, 2021 at 9:43 AM Amit Kapila wrote: > > On Tue, Apr 27, 2021 at 9:17 AM Masahiko Sawada wrote: > > > > On Tue, Apr 27, 2021 at 11:31 AM vignesh C wrote: > > > > > > > > And I think there is > > > > > also a risk to increase shared memory when we want to add other > > > > > statistics in the future. > > > > > > > > > > > > > Yeah, so do you think it is not a good idea to store stats in > > > > ReplicationSlot? Actually storing them in a slot makes it easier to > > > > send them during ReplicationSlotRelease which is quite helpful if the > > > > replication is interrupted due to some reason. Or the other idea was > > > > that we send stats every time we stream or spill changes. > > > > > > We use around 64 bytes of shared memory to store the statistics > > > information per slot, I'm not sure if this is a lot of memory. If this > > > memory is fine, then I felt the approach to store stats seems fine. If > > > that memory is too much then we could use the other approach to update > > > stats when we stream or spill the changes as suggested by Amit. > > > > I agree that makes it easier to send slot stats during > > ReplicationSlotRelease() but I'd prefer to avoid storing data that > > doesn't need to be shared in the shared buffer if possible. > > > > Sounds reasonable and we might add some stats in the future so that > will further increase the usage of shared memory. > > > And those > > counters are not used by physical slots at all. If sending slot stats > > every time we stream or spill changes doesn't affect the system much, > > I think it's better than having slot stats in the shared memory. > > > > As the minimum size of logical_decoding_work_mem is 64KB, so in the > worst case, we will send stats after decoding that many changes. I > don't think it would impact too much considering that we need to spill > or stream those many changes. If it concerns any users they can > always increase logical_decoding_work_mem. 
The default value is 64MB > at which point, I don't think it will matter sending the stats. Sounds good to me, I will rebase my previous patch and send a patch for this. Regards, Vignesh
Re: Replication slot stats misgivings
On Tue, Apr 27, 2021 at 9:17 AM Masahiko Sawada wrote: > > On Tue, Apr 27, 2021 at 11:31 AM vignesh C wrote: > > > > > > And I think there is > > > > also a risk to increase shared memory when we want to add other > > > > statistics in the future. > > > > > > > > > > Yeah, so do you think it is not a good idea to store stats in > > > ReplicationSlot? Actually storing them in a slot makes it easier to > > > send them during ReplicationSlotRelease which is quite helpful if the > > > replication is interrupted due to some reason. Or the other idea was > > > that we send stats every time we stream or spill changes. > > > > We use around 64 bytes of shared memory to store the statistics > > information per slot, I'm not sure if this is a lot of memory. If this > > memory is fine, then I felt the approach to store stats seems fine. If > > that memory is too much then we could use the other approach to update > > stats when we stream or spill the changes as suggested by Amit. > > I agree that makes it easier to send slot stats during > ReplicationSlotRelease() but I'd prefer to avoid storing data that > doesn't need to be shared in the shared buffer if possible. > Sounds reasonable and we might add some stats in the future so that will further increase the usage of shared memory. > And those > counters are not used by physical slots at all. If sending slot stats > every time we stream or spill changes doesn't affect the system much, > I think it's better than having slot stats in the shared memory. > As the minimum size of logical_decoding_work_mem is 64KB, so in the worst case, we will send stats after decoding that many changes. I don't think it would impact too much considering that we need to spill or stream those many changes. If it concerns any users they can always increase logical_decoding_work_mem. The default value is 64MB at which point, I don't think it will matter sending the stats. 
> Also, not sure it’s better but another idea would be to make the slot > stats a global variable like pgBufferUsage and use it during decoding. > Hmm, I think it is better to avoid global variables if possible. > Or we can set a proc-exit callback? But to be honest, I'm not sure > which approach we should go with. Those approaches have proc and cons. > I think we can try the first approach listed here which is to send stats each time we spill or stream. -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Tue, Apr 27, 2021 at 11:31 AM vignesh C wrote: > > On Mon, Apr 26, 2021 at 8:42 AM Amit Kapila wrote: > > > > On Mon, Apr 26, 2021 at 8:01 AM Masahiko Sawada > > wrote: > > > > > > On Fri, Apr 23, 2021 at 6:15 PM Amit Kapila > > > wrote: > > > > > > > > On Mon, Apr 19, 2021 at 4:28 PM vignesh C wrote: > > > > > > > > > > I have made the changes to update the replication statistics at > > > > > replication slot release. Please find the patch attached for the same. > > > > > Thoughts? > > > > > > > > > > > > > Thanks, the changes look mostly good to me. The slot stats need to be > > > > initialized in RestoreSlotFromDisk and ReplicationSlotCreate, not in > > > > StartupDecodingContext. Apart from that, I have moved the declaration > > > > of UpdateDecodingStats from slot.h back to logical.h. I have also > > > > added/edited a few comments. Please check and let me know what do you > > > > think of the attached? > > > > > > The patch moves slot stats to the ReplicationSlot data that is on the > > > shared memory. If we have a space to store the statistics in the > > > shared memory can we simply accumulate the stats there and make them > > > persistent without using the stats collector? > > > > > > > But for that, we need to write to file at every commit/abort/prepare > > (decode of commit) which I think will incur significant overhead. > > Also, we try to write after few commits then there is a danger of > > losing them and still there could be a visible overhead for small > > transactions. > > > > I preferred not to persist this information to file, let's have stats > collector handle the stats persisting. > > > > And I think there is > > > also a risk to increase shared memory when we want to add other > > > statistics in the future. > > > > > > > Yeah, so do you think it is not a good idea to store stats in > > ReplicationSlot? 
Actually storing them in a slot makes it easier to > > send them during ReplicationSlotRelease which is quite helpful if the > > replication is interrupted due to some reason. Or the other idea was > > that we send stats every time we stream or spill changes. > > We use around 64 bytes of shared memory to store the statistics > information per slot, I'm not sure if this is a lot of memory. If this > memory is fine, then I felt the approach to store stats seems fine. If > that memory is too much then we could use the other approach to update > stats when we stream or spill the changes as suggested by Amit. I agree that makes it easier to send slot stats during ReplicationSlotRelease() but I'd prefer to avoid storing data that doesn't need to be shared in the shared buffer if possible. And those counters are not used by physical slots at all. If sending slot stats every time we stream or spill changes doesn't affect the system much, I think it's better than having slot stats in the shared memory. Also, not sure it’s better but another idea would be to make the slot stats a global variable like pgBufferUsage and use it during decoding. Or we can set a proc-exit callback? But to be honest, I'm not sure which approach we should go with. Those approaches have pros and cons. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Replication slot stats misgivings
On Tue, Apr 27, 2021 at 11:45 AM Amit Kapila wrote: > > On Tue, Apr 27, 2021 at 8:01 AM vignesh C wrote: > > > > On Mon, Apr 26, 2021 at 8:42 AM Amit Kapila wrote: > > > > > > On Mon, Apr 26, 2021 at 8:01 AM Masahiko Sawada > > > wrote: > > > > > > > > On Fri, Apr 23, 2021 at 6:15 PM Amit Kapila > > > > wrote: > > > > > > > > > > On Mon, Apr 19, 2021 at 4:28 PM vignesh C wrote: > > > > > > > > > > > > I have made the changes to update the replication statistics at > > > > > > replication slot release. Please find the patch attached for the > > > > > > same. > > > > > > Thoughts? > > > > > > > > > > > > > > > > Thanks, the changes look mostly good to me. The slot stats need to be > > > > > initialized in RestoreSlotFromDisk and ReplicationSlotCreate, not in > > > > > StartupDecodingContext. Apart from that, I have moved the declaration > > > > > of UpdateDecodingStats from slot.h back to logical.h. I have also > > > > > added/edited a few comments. Please check and let me know what do you > > > > > think of the attached? > > > > > > > > The patch moves slot stats to the ReplicationSlot data that is on the > > > > shared memory. If we have a space to store the statistics in the > > > > shared memory can we simply accumulate the stats there and make them > > > > persistent without using the stats collector? > > > > > > > > > > But for that, we need to write to file at every commit/abort/prepare > > > (decode of commit) which I think will incur significant overhead. > > > Also, we try to write after few commits then there is a danger of > > > losing them and still there could be a visible overhead for small > > > transactions. > > > > > > > I preferred not to persist this information to file, let's have stats > > collector handle the stats persisting. > > > > Sawada-San, I would like to go ahead with your > "Use-HTAB-for-replication-slot-statistics" unless you think otherwise? I agree that it's better to use the stats collector. So please go ahead. 
Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Replication slot stats misgivings
On Tue, Apr 27, 2021 at 8:01 AM vignesh C wrote: > > On Mon, Apr 26, 2021 at 8:42 AM Amit Kapila wrote: > > > > On Mon, Apr 26, 2021 at 8:01 AM Masahiko Sawada > > wrote: > > > > > > On Fri, Apr 23, 2021 at 6:15 PM Amit Kapila > > > wrote: > > > > > > > > On Mon, Apr 19, 2021 at 4:28 PM vignesh C wrote: > > > > > > > > > > I have made the changes to update the replication statistics at > > > > > replication slot release. Please find the patch attached for the same. > > > > > Thoughts? > > > > > > > > > > > > > Thanks, the changes look mostly good to me. The slot stats need to be > > > > initialized in RestoreSlotFromDisk and ReplicationSlotCreate, not in > > > > StartupDecodingContext. Apart from that, I have moved the declaration > > > > of UpdateDecodingStats from slot.h back to logical.h. I have also > > > > added/edited a few comments. Please check and let me know what do you > > > > think of the attached? > > > > > > The patch moves slot stats to the ReplicationSlot data that is on the > > > shared memory. If we have a space to store the statistics in the > > > shared memory can we simply accumulate the stats there and make them > > > persistent without using the stats collector? > > > > > > > But for that, we need to write to file at every commit/abort/prepare > > (decode of commit) which I think will incur significant overhead. > > Also, we try to write after few commits then there is a danger of > > losing them and still there could be a visible overhead for small > > transactions. > > > > I preferred not to persist this information to file, let's have stats > collector handle the stats persisting. > Sawada-San, I would like to go ahead with your "Use-HTAB-for-replication-slot-statistics" unless you think otherwise? -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Mon, Apr 26, 2021 at 8:42 AM Amit Kapila wrote: > > On Mon, Apr 26, 2021 at 8:01 AM Masahiko Sawada wrote: > > > > On Fri, Apr 23, 2021 at 6:15 PM Amit Kapila wrote: > > > > > > On Mon, Apr 19, 2021 at 4:28 PM vignesh C wrote: > > > > > > > > I have made the changes to update the replication statistics at > > > > replication slot release. Please find the patch attached for the same. > > > > Thoughts? > > > > > > > > > > Thanks, the changes look mostly good to me. The slot stats need to be > > > initialized in RestoreSlotFromDisk and ReplicationSlotCreate, not in > > > StartupDecodingContext. Apart from that, I have moved the declaration > > > of UpdateDecodingStats from slot.h back to logical.h. I have also > > > added/edited a few comments. Please check and let me know what do you > > > think of the attached? > > > > The patch moves slot stats to the ReplicationSlot data that is on the > > shared memory. If we have a space to store the statistics in the > > shared memory can we simply accumulate the stats there and make them > > persistent without using the stats collector? > > > > But for that, we need to write to file at every commit/abort/prepare > (decode of commit) which I think will incur significant overhead. > Also, we try to write after few commits then there is a danger of > losing them and still there could be a visible overhead for small > transactions. > I preferred not to persist this information to file, let's have stats collector handle the stats persisting. > > And I think there is > > also a risk to increase shared memory when we want to add other > > statistics in the future. > > > > Yeah, so do you think it is not a good idea to store stats in > ReplicationSlot? Actually storing them in a slot makes it easier to > send them during ReplicationSlotRelease which is quite helpful if the > replication is interrupted due to some reason. Or the other idea was > that we send stats every time we stream or spill changes. 
We use around 64 bytes of shared memory to store the statistics information per slot, I'm not sure if this is a lot of memory. If this memory is fine, then I felt the approach to store stats seems fine. If that memory is too much then we could use the other approach to update stats when we stream or spill the changes as suggested by Amit. Regards, Vignesh
Re: Replication slot stats misgivings
On Mon, Apr 26, 2021 at 8:01 AM Masahiko Sawada wrote: > > On Fri, Apr 23, 2021 at 6:15 PM Amit Kapila wrote: > > > > On Mon, Apr 19, 2021 at 4:28 PM vignesh C wrote: > > > > > > I have made the changes to update the replication statistics at > > > replication slot release. Please find the patch attached for the same. > > > Thoughts? > > > > > > > Thanks, the changes look mostly good to me. The slot stats need to be > > initialized in RestoreSlotFromDisk and ReplicationSlotCreate, not in > > StartupDecodingContext. Apart from that, I have moved the declaration > > of UpdateDecodingStats from slot.h back to logical.h. I have also > > added/edited a few comments. Please check and let me know what do you > > think of the attached? > > The patch moves slot stats to the ReplicationSlot data that is on the > shared memory. If we have a space to store the statistics in the > shared memory can we simply accumulate the stats there and make them > persistent without using the stats collector? > But for that, we need to write to file at every commit/abort/prepare (decode of commit) which I think will incur significant overhead. Also, we try to write after few commits then there is a danger of losing them and still there could be a visible overhead for small transactions. > And I think there is > also a risk to increase shared memory when we want to add other > statistics in the future. > Yeah, so do you think it is not a good idea to store stats in ReplicationSlot? Actually storing them in a slot makes it easier to send them during ReplicationSlotRelease which is quite helpful if the replication is interrupted due to some reason. Or the other idea was that we send stats every time we stream or spill changes. -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Fri, Apr 23, 2021 at 6:15 PM Amit Kapila wrote: > > On Mon, Apr 19, 2021 at 4:28 PM vignesh C wrote: > > > > I have made the changes to update the replication statistics at > > replication slot release. Please find the patch attached for the same. > > Thoughts? > > > > Thanks, the changes look mostly good to me. The slot stats need to be > initialized in RestoreSlotFromDisk and ReplicationSlotCreate, not in > StartupDecodingContext. Apart from that, I have moved the declaration > of UpdateDecodingStats from slot.h back to logical.h. I have also > added/edited a few comments. Please check and let me know what do you > think of the attached? The patch moves slot stats to the ReplicationSlot data that is on the shared memory. If we have a space to store the statistics in the shared memory can we simply accumulate the stats there and make them persistent without using the stats collector? And I think there is also a risk to increase shared memory when we want to add other statistics in the future. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Replication slot stats misgivings
On Mon, Apr 19, 2021 at 4:28 PM vignesh C wrote: > > I have made the changes to update the replication statistics at > replication slot release. Please find the patch attached for the same. > Thoughts? > Thanks, the changes look mostly good to me. The slot stats need to be initialized in RestoreSlotFromDisk and ReplicationSlotCreate, not in StartupDecodingContext. Apart from that, I have moved the declaration of UpdateDecodingStats from slot.h back to logical.h. I have also added/edited a few comments. Please check and let me know what do you think of the attached? -- With Regards, Amit Kapila. v2-0001-Update-decoding-stats-during-replication-slot-rel.patch Description: Binary data
Re: Replication slot stats misgivings
On Thu, Apr 22, 2021 at 1:02 PM Masahiko Sawada wrote: > Thanks, it looks good to me now. I'll review/test some more before committing but at this stage, I would like to know from Andres or others whether they see any problem with this approach to fixing a few of the problems reported in this thread. Basically, it will fix the cases where the drop message is lost and we were not able to record stats for new slots and writing beyond the end of the array when after restarting the number of slots whose stats are stored in the stats file exceeds max_replication_slots. It uses HTAB instead of an array to record slot stats and also taught pgstat_vacuum_stat() to search for all the dead replication slots in stats hashtable and tell the collector to remove them. This still uses slot_name as the key because we were not able to find a better way to use slot's idx. Andres, unless you see any problems with this approach, I would like to move forward with this early next week? -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Thu, Apr 22, 2021 at 3:03 PM Amit Kapila wrote: > > On Thu, Apr 22, 2021 at 10:39 AM Masahiko Sawada > wrote: > > > > On Thu, Apr 22, 2021 at 1:50 PM Amit Kapila wrote: > > > > > > On Thu, Apr 22, 2021 at 8:26 AM Masahiko Sawada > > > wrote: > > > > > > > > > 2. > > > + if (replSlotStatHash != NULL) > > > + (void) hash_search(replSlotStatHash, > > > +(void *) &(msg->m_slotname), > > > +HASH_REMOVE, > > > +NULL); > > > > > > Why have you changed this part from using NameStr? > > > > I thought that since the hash table is created with the key size > > sizeof(NameData) it's better to use NameData for searching as well. > > > > Fair enough. I think this will give the same result either way. I've attached the updated version patch. Besides the review comment from Amit, I changed pgstat_read_statsfiles() so that it doesn't use pgstat_get_replslot_entry(). That’s because it was slightly unclear why we create the hash table beforehand even though we call pgstat_get_replslot_entry() with ‘create’ = true. By this change, the caller of pgstat_get_replslot_entry() with ‘create’ = true is only pgstat_recv_replslot(), which makes the code clear and safe since pgstat_recv_replslot() is used only by the collector. Also, I ran pgindent on the modified files. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ v13-0001-Use-HTAB-for-replication-slot-statistics.patch Description: Binary data
Re: Replication slot stats misgivings
On Thu, Apr 22, 2021 at 10:39 AM Masahiko Sawada wrote: > > On Thu, Apr 22, 2021 at 1:50 PM Amit Kapila wrote: > > > > On Thu, Apr 22, 2021 at 8:26 AM Masahiko Sawada > > wrote: > > > > > > 2. > > + if (replSlotStatHash != NULL) > > + (void) hash_search(replSlotStatHash, > > +(void *) &(msg->m_slotname), > > +HASH_REMOVE, > > +NULL); > > > > Why have you changed this part from using NameStr? > > I thought that since the hash table is created with the key size > sizeof(NameData) it's better to use NameData for searching as well. > Fair enough. I think this will give the same result either way. -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Thu, Apr 22, 2021 at 1:50 PM Amit Kapila wrote: > > On Thu, Apr 22, 2021 at 8:26 AM Masahiko Sawada wrote: > > > > Few comments: > 1. > I think we want stats collector to not use pgStatLocalContext unless > it has read the stats file similar to other cases. So probably, we > should allocate it in pgStatLocalContext when we read 'R' message in > pgstat_read_statsfiles. Also, the function pgstat_get_replslot_entry > should not use pgStatLocalContext to allocate the hash table. Agreed. > 2. > + if (replSlotStatHash != NULL) > + (void) hash_search(replSlotStatHash, > +(void *) &(msg->m_slotname), > +HASH_REMOVE, > +NULL); > > Why have you changed this part from using NameStr? I thought that since the hash table is created with the key size sizeof(NameData) it's better to use NameData for searching as well. > 3. > +# Check that replicatoin slot stats are expected. > > Typo. replicatoin/replication Will fix in the next version. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Replication slot stats misgivings
On Thu, Apr 22, 2021 at 8:26 AM Masahiko Sawada wrote: > Few comments: 1. I think we want stats collector to not use pgStatLocalContext unless it has read the stats file similar to other cases. So probably, we should allocate it in pgStatLocalContext when we read 'R' message in pgstat_read_statsfiles. Also, the function pgstat_get_replslot_entry should not use pgStatLocalContext to allocate the hash table. 2. + if (replSlotStatHash != NULL) + (void) hash_search(replSlotStatHash, +(void *) &(msg->m_slotname), +HASH_REMOVE, +NULL); Why have you changed this part from using NameStr? 3. +# Check that replicatoin slot stats are expected. Typo. replicatoin/replication -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Thu, Apr 22, 2021 at 7:52 AM Masahiko Sawada wrote: > > On Wed, Apr 21, 2021 at 4:44 PM Dilip Kumar wrote: > > > > On Tue, Apr 20, 2021 at 7:54 PM Masahiko Sawada > > wrote: > > > > > > > > I've attached the patch. In addition to the test Vignesh prepared, I > > > added one test for the message for creating a slot that checks if the > > > statistics are initialized after re-creating the same name slot. > > > Please review it. > > > > Overall the patch looks good to me. However, I have one question, I > > did not understand the reason behind moving the below code from > > "pgstat_reset_replslot_counter" to "pg_stat_reset_replication_slot"? > > Andres pointed out that pgstat_reset_replslot_counter() acquires lwlock[1]: > > --- > - pgstat_reset_replslot_counter() acquires ReplicationSlotControlLock. I > think pgstat.c has absolutely no business doing things on that level. > --- > > I changed the code so that pgstat_reset_replslot_counter() doesn't > acquire directly lwlock but I think that it's appropriate to do the > existence check for slots in pgstatfunc.c rather than pgstat.c. Thanks for pointing that out. It makes sense to me. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: Replication slot stats misgivings
On Wed, Apr 21, 2021 at 7:11 PM Amit Kapila wrote: > > On Wed, Apr 21, 2021 at 3:39 PM Masahiko Sawada wrote: > > > > > > > > The test is not waiting for a new slot creation message to reach the > > > stats collector. So, if the old slot data still exists in the file and > > > now when we read stats via backend, then won't there exists a chance > > > that old slot stats data still exists? > > > > You're right. We should wait for the message to reach the collector. > > Or should we remove that test case? > > > > I feel we can remove it. I am not sure how much value this additional > test case is adding. Okay, removed. I’ve attached the updated patch. Please review it. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ From d3d6e7e2c059423c2277d0cc152077a063938391 Mon Sep 17 00:00:00 2001 From: Masahiko Sawada Date: Mon, 19 Apr 2021 16:40:27 +0900 Subject: [PATCH v12] Use HTAB for replication slot statistics. Previously, we used to use the array to store stats for replication slots. But this had two problems in the cases where a message for dropping a slot gets lost, which leaves the stats for already-dropped slot in the array: 1) the stats for the new slot are not recorded if the array is full and 2) writing beyond the end of the array when after restarting the number of slots whose stats are stored in the stats file exceeds max_replication_slots. This commit changes it to HTAB for replication slot statistics, resolving both problems. Instead, we have pgstat_vacuum_stat() search for all the dead replication slots in stats hashtable and tell the collector to remove them. To not show the stats for the already-dropped slots, pg_replication_slots view searches slot stats by the slot name taken from pg_replication_slots. Also, we send a message for creating a slot at slot creation, initializing the stats. This reduces the possibility that the stats are accumulated into the old slot stats when a message for dropping a slot gets lost. 
--- contrib/test_decoding/t/001_repl_stats.pl | 54 +++- src/backend/catalog/system_views.sql | 30 +- src/backend/postmaster/pgstat.c | 319 -- src/backend/replication/logical/logical.c | 2 +- src/backend/replication/slot.c| 26 +- src/backend/utils/adt/pgstatfuncs.c | 145 ++ src/include/catalog/pg_proc.dat | 14 +- src/include/pgstat.h | 10 +- src/include/replication/slot.h| 2 +- src/test/regress/expected/rules.out | 4 +- src/tools/pgindent/typedefs.list | 2 +- 11 files changed, 359 insertions(+), 249 deletions(-) diff --git a/contrib/test_decoding/t/001_repl_stats.pl b/contrib/test_decoding/t/001_repl_stats.pl index 11b6cd9b9c..ad19ad83c6 100644 --- a/contrib/test_decoding/t/001_repl_stats.pl +++ b/contrib/test_decoding/t/001_repl_stats.pl @@ -2,9 +2,10 @@ # drop replication slot and restart. use strict; use warnings; +use File::Path qw(rmtree); use PostgresNode; use TestLib; -use Test::More tests => 1; +use Test::More tests => 2; # Test set-up my $node = get_new_node('test'); @@ -12,6 +13,20 @@ $node->init(allows_streaming => 'logical'); $node->append_conf('postgresql.conf', 'synchronous_commit = on'); $node->start; +# Check that replicatoin slot stats are expected. +sub test_slot_stats +{ + my ($node, $expected, $msg) = @_; + + my $result = $node->safe_psql( + 'postgres', qq[ + SELECT slot_name, total_txns > 0 AS total_txn, + total_bytes > 0 AS total_bytes + FROM pg_stat_replication_slots + ORDER BY slot_name]); + is($result, $expected, $msg); +} + # Create table. $node->safe_psql('postgres', "CREATE TABLE test_repl_stat(col1 int)"); @@ -57,20 +72,41 @@ $node->start; # Verify statistics data present in pg_stat_replication_slots are sane after # restart. 
-my $result = $node->safe_psql('postgres', - "SELECT slot_name, total_txns > 0 AS total_txn, - total_bytes > 0 AS total_bytes FROM pg_stat_replication_slots - ORDER BY slot_name" -); -is($result, qq(regression_slot1|t|t +test_slot_stats( + $node, + qq(regression_slot1|t|t regression_slot2|t|t -regression_slot3|t|t), 'check replication statistics are updated'); +regression_slot3|t|t), + 'check replication statistics are updated'); + +# Test to remove one of the replication slots and adjust +# max_replication_slots accordingly to the number of slots. This leads +# to a mismatch between the number of slots present in the stats file and the +# number of stats present in the shared memory, simulating the scenario for +# drop slot message lost by the statistics collector process. We verify +# replication statistics data is fine after restart. + +$node->stop; +my $datadir = $node->data_dir; +my $slot3_replslotdir = "$datadir/pg_replslot/regression_slot3"; + +rmtree($slot3_replslotdir); + +$node->append_conf('postgresql.conf', 'max_replication_slots = 2'); +$node->start; + +# Verify statistics data present in
Re: Replication slot stats misgivings
On Wed, Apr 21, 2021 at 4:44 PM Dilip Kumar wrote: > > On Tue, Apr 20, 2021 at 7:54 PM Masahiko Sawada wrote: > > > > > I've attached the patch. In addition to the test Vignesh prepared, I > > added one test for the message for creating a slot that checks if the > > statistics are initialized after re-creating the same name slot. > > Please review it. > > Overall the patch looks good to me. However, I have one question, I > did not understand the reason behind moving the below code from > "pgstat_reset_replslot_counter" to "pg_stat_reset_replication_slot"? Andres pointed out that pgstat_reset_replslot_counter() acquires lwlock[1]: --- - pgstat_reset_replslot_counter() acquires ReplicationSlotControlLock. I think pgstat.c has absolutely no business doing things on that level. --- I changed the code so that pgstat_reset_replslot_counter() doesn't acquire directly lwlock but I think that it's appropriate to do the existence check for slots in pgstatfunc.c rather than pgstat.c. Regards, [1] https://www.postgresql.org/message-id/20210319185247.ldebgpdaxsowiflw%40alap3.anarazel.de -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Replication slot stats misgivings
On Wed, Apr 21, 2021 at 3:39 PM Masahiko Sawada wrote: > > > > > The test is not waiting for a new slot creation message to reach the > > stats collector. So, if the old slot data still exists in the file and > > now when we read stats via backend, then won't there exists a chance > > that old slot stats data still exists? > > You're right. We should wait for the message to reach the collector. > Or should we remove that test case? > I feel we can remove it. I am not sure how much value this additional test case is adding. -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Wed, Apr 21, 2021 at 6:36 PM Amit Kapila wrote: > > On Wed, Apr 21, 2021 at 2:46 PM Masahiko Sawada wrote: > > > > On Wed, Apr 21, 2021 at 3:09 PM Amit Kapila wrote: > > > > > > On Tue, Apr 20, 2021 at 7:54 PM Masahiko Sawada > > > wrote: > > > > > > > > > > > > I've attached the patch. In addition to the test Vignesh prepared, I > > > > added one test for the message for creating a slot that checks if the > > > > statistics are initialized after re-creating the same name slot. > > > > > > > > > > I am not sure how much useful your new test is because you are testing > > > it for slot name for which we have removed the slot file. It is not > > > related to stat messages this patch is sending. I think we can leave > > > that for now. > > > > I might be missing something but I think the test is related to the > > message for creating a slot that initializes all counters. No? If > > there is no that message, we will end up getting old stats if a > > message for dropping slot gets lost (simulated by dropping slot file) > > and the same name slot is created. > > > > The test is not waiting for a new slot creation message to reach the > stats collector. So, if the old slot data still exists in the file and > now when we read stats via backend, then won't there exists a chance > that old slot stats data still exists? You're right. We should wait for the message to reach the collector. Or should we remove that test case? Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Replication slot stats misgivings
On Wed, Apr 21, 2021 at 6:20 PM Amit Kapila wrote: > > On Wed, Apr 21, 2021 at 2:37 PM Masahiko Sawada wrote: > > > > On Wed, Apr 21, 2021 at 12:50 PM Amit Kapila > > wrote: > > > > > > On Tue, Apr 20, 2021 at 7:54 PM Masahiko Sawada > > > wrote: > > > > > > > > > > I have one question: > > > > > > + /* > > > + * Create the replication slot stats hash table if we don't have > > > + * it already. > > > + */ > > > + if (replSlotStats == NULL) > > > { > > > - if (namestrcmp(&replSlotStats[i].slotname, name) == 0) > > > - return i; /* found */ > > > + HASHCTL hash_ctl; > > > + > > > + hash_ctl.keysize = sizeof(NameData); > > > + hash_ctl.entrysize = sizeof(PgStat_StatReplSlotEntry); > > > + hash_ctl.hcxt = pgStatLocalContext; > > > + > > > + replSlotStats = hash_create("Replication slots hash", > > > + PGSTAT_REPLSLOT_HASH_SIZE, > > > + &hash_ctl, > > > + HASH_ELEM | HASH_BLOBS | HASH_CONTEXT); > > > } > > > > > > It seems to me that the patch is always creating a hash table in > > > pgStatLocalContext? AFAIU, we need to create it in pgStatLocalContext > > > when we read stats via backend_read_statsfile so that we can clear it > > > at the end of the transaction. The db/function stats seems to be doing > > > the same. Is there a reason why here we need to always create it in > > > pgStatLocalContext? > > > > I wanted to avoid creating the hash table if there is no replication > > slot. But as you pointed out, we create the hash table even when > > lookup (i.e., create_it is false), which is bad. So I think we can > > have pgstat_get_replslot_entry() return NULL without creating the hash > > table if the hash table is NULL and create_it is false so that backend > > processes don’t create the hash table, not via > > backend_read_statsfile(). Or another idea would be to always create > > the hash table in pgstat_read_statsfiles(). That way, it would > > simplify the code but could waste the memory if there is no > > replication slot. 
> > > > If you create it after reading 'R' message as we do in the case of 'D' > message then it won't waste any memory. So probably creating in > pgstat_read_statsfiles() would be better unless you see some other > problem with that. Yeah, I think that's the approach I mentioned as the former. I’ll change in the next version patch. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Replication slot stats misgivings
On Wed, Apr 21, 2021 at 2:46 PM Masahiko Sawada wrote: > > On Wed, Apr 21, 2021 at 3:09 PM Amit Kapila wrote: > > > > On Tue, Apr 20, 2021 at 7:54 PM Masahiko Sawada > > wrote: > > > > > > > > > I've attached the patch. In addition to the test Vignesh prepared, I > > > added one test for the message for creating a slot that checks if the > > > statistics are initialized after re-creating the same name slot. > > > > > > > I am not sure how much useful your new test is because you are testing > > it for slot name for which we have removed the slot file. It is not > > related to stat messages this patch is sending. I think we can leave > > that for now. > > I might be missing something but I think the test is related to the > message for creating a slot that initializes all counters. No? If > there is no that message, we will end up getting old stats if a > message for dropping slot gets lost (simulated by dropping slot file) > and the same name slot is created. > The test is not waiting for a new slot creation message to reach the stats collector. So, if the old slot data still exists in the file and now when we read stats via backend, then won't there exists a chance that old slot stats data still exists? -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Wed, Apr 21, 2021 at 2:37 PM Masahiko Sawada wrote: > > On Wed, Apr 21, 2021 at 12:50 PM Amit Kapila wrote: > > > > On Tue, Apr 20, 2021 at 7:54 PM Masahiko Sawada > > wrote: > > > > > > > I have one question: > > > > + /* > > + * Create the replication slot stats hash table if we don't have > > + * it already. > > + */ > > + if (replSlotStats == NULL) > > { > > - if (namestrcmp(&replSlotStats[i].slotname, name) == 0) > > - return i; /* found */ > > + HASHCTL hash_ctl; > > + > > + hash_ctl.keysize = sizeof(NameData); > > + hash_ctl.entrysize = sizeof(PgStat_StatReplSlotEntry); > > + hash_ctl.hcxt = pgStatLocalContext; > > + > > + replSlotStats = hash_create("Replication slots hash", > > + PGSTAT_REPLSLOT_HASH_SIZE, > > + &hash_ctl, > > + HASH_ELEM | HASH_BLOBS | HASH_CONTEXT); > > } > > > > It seems to me that the patch is always creating a hash table in > > pgStatLocalContext? AFAIU, we need to create it in pgStatLocalContext > > when we read stats via backend_read_statsfile so that we can clear it > > at the end of the transaction. The db/function stats seems to be doing > > the same. Is there a reason why here we need to always create it in > > pgStatLocalContext? > > I wanted to avoid creating the hash table if there is no replication > slot. But as you pointed out, we create the hash table even when > lookup (i.e., create_it is false), which is bad. So I think we can > have pgstat_get_replslot_entry() return NULL without creating the hash > table if the hash table is NULL and create_it is false so that backend > processes don’t create the hash table, not via > backend_read_statsfile(). Or another idea would be to always create > the hash table in pgstat_read_statsfiles(). That way, it would > simplify the code but could waste the memory if there is no > replication slot. > If you create it after reading 'R' message as we do in the case of 'D' message then it won't waste any memory. 
So probably creating in pgstat_read_statsfiles() would be better unless you see some other problem with that. -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Wed, Apr 21, 2021 at 3:09 PM Amit Kapila wrote: > > On Tue, Apr 20, 2021 at 7:54 PM Masahiko Sawada wrote: > > > > > > I've attached the patch. In addition to the test Vignesh prepared, I > > added one test for the message for creating a slot that checks if the > > statistics are initialized after re-creating the same name slot. > > > > I am not sure how much useful your new test is because you are testing > it for slot name for which we have removed the slot file. It is not > related to stat messages this patch is sending. I think we can leave > that for now. I might be missing something but I think the test is related to the message for creating a slot that initializes all counters. No? If there is no that message, we will end up getting old stats if a message for dropping slot gets lost (simulated by dropping slot file) and the same name slot is created. > One other minor comment: > > - * create the statistics for the replication slot. > + * create the statistics for the replication slot. In the cases where the > + * message for dropping the old slot gets lost and a slot with the same > + * name is created, since the stats will be initialized by the message > + * for creating the slot the statistics are not accumulated into the > + * old slot unless the messages for both creating and dropping slots with > + * the same name got lost. Just in case it happens, the user can reset > + * the particular stats by pg_stat_reset_replication_slot(). > > I think we can change it to something like: " XXX In case, the > messages for creation and drop slot of the same name get lost and > create happens before (auto)vacuum cleans up the dead slot, the stats > will be accumulated into the old slot. One can imagine having OIDs for > each slot to avoid the accumulation of stats but that doesn't seem > worth doing as in practice this won't happen frequently.". Also, I am > not sure after your recent change whether it is a good idea to mention > something in docs. 
What do you think? Both points make sense to me. I'll update the comment and remove the mention in the doc in the next version patch. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Replication slot stats misgivings
On Wed, Apr 21, 2021 at 12:50 PM Amit Kapila wrote: > > On Tue, Apr 20, 2021 at 7:54 PM Masahiko Sawada wrote: > > > > I have one question: > > + /* > + * Create the replication slot stats hash table if we don't have > + * it already. > + */ > + if (replSlotStats == NULL) > { > - if (namestrcmp(&replSlotStats[i].slotname, name) == 0) > - return i; /* found */ > + HASHCTL hash_ctl; > + > + hash_ctl.keysize = sizeof(NameData); > + hash_ctl.entrysize = sizeof(PgStat_StatReplSlotEntry); > + hash_ctl.hcxt = pgStatLocalContext; > + > + replSlotStats = hash_create("Replication slots hash", > + PGSTAT_REPLSLOT_HASH_SIZE, > + &hash_ctl, > + HASH_ELEM | HASH_BLOBS | HASH_CONTEXT); > } > > It seems to me that the patch is always creating a hash table in > pgStatLocalContext? AFAIU, we need to create it in pgStatLocalContext > when we read stats via backend_read_statsfile so that we can clear it > at the end of the transaction. The db/function stats seems to be doing > the same. Is there a reason why here we need to always create it in > pgStatLocalContext? I wanted to avoid creating the hash table if there is no replication slot. But as you pointed out, we create the hash table even when lookup (i.e., create_it is false), which is bad. So I think we can have pgstat_get_replslot_entry() return NULL without creating the hash table if the hash table is NULL and create_it is false so that backend processes don’t create the hash table, not via backend_read_statsfile(). Or another idea would be to always create the hash table in pgstat_read_statsfiles(). That way, it would simplify the code but could waste the memory if there is no replication slot. I slightly prefer the former but what do you think? Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Replication slot stats misgivings
On Tue, Apr 20, 2021 at 7:54 PM Masahiko Sawada wrote: > > I've attached the patch. In addition to the test Vignesh prepared, I > added one test for the message for creating a slot that checks if the > statistics are initialized after re-creating the same name slot. > Please review it. Overall the patch looks good to me. However, I have one question: I did not understand the reason behind moving the below code from "pgstat_reset_replslot_counter" to "pg_stat_reset_replication_slot"? +/* + * Check if the slot exists with the given name. It is possible that by + * the time this message is executed the slot is dropped but at least + * this check will ensure that the given name is for a valid slot. + */ +slot = SearchNamedReplicationSlot(target, true); + +if (!slot) +ereport(ERROR, +(errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("replication slot \"%s\" does not exist", +target))); + +/* + * Nothing to do for physical slots as we collect stats only for + * logical slots. + */ +if (SlotIsPhysical(slot)) +PG_RETURN_VOID(); +} + pgstat_reset_replslot_counter(target); -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
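One reading of why the check belongs in pg_stat_reset_replication_slot() is that validation must happen in the backend, where an error can be reported to the user, before any message is handed to the stats collector, which cannot report errors back. A simplified, hypothetical model of that validate-then-send shape is sketched below; none of these names are the real pgstat API, and error reporting is reduced to a return code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { SLOT_PHYSICAL, SLOT_LOGICAL } SlotKind;

typedef struct
{
    const char *name;
    SlotKind    kind;
} Slot;

/* Hypothetical fixed slot list standing in for SearchNamedReplicationSlot(). */
static const Slot slots[] = {
    {"phys_slot", SLOT_PHYSICAL},
    {"logical_slot", SLOT_LOGICAL},
};

static const Slot *
search_named_slot(const char *name)
{
    for (size_t i = 0; i < sizeof(slots) / sizeof(slots[0]); i++)
        if (strcmp(slots[i].name, name) == 0)
            return &slots[i];
    return NULL;
}

/* Counts messages "sent to the collector", standing in for the real send. */
static int reset_messages_sent = 0;

/*
 * Models pg_stat_reset_replication_slot(): validate the name first so the
 * caller gets an error for a nonexistent slot, skip physical slots (no
 * stats are kept for them), and only then send the reset message.
 * Returns 0 on success, -1 for an unknown slot (standing in for ereport).
 */
static int
reset_replication_slot_stats(const char *name)
{
    const Slot *slot = search_named_slot(name);

    if (slot == NULL)
        return -1;              /* "replication slot ... does not exist" */
    if (slot->kind == SLOT_PHYSICAL)
        return 0;               /* nothing to do for physical slots */

    reset_messages_sent++;      /* stand-in for pgstat_reset_replslot_counter() */
    return 0;
}
```

The key property is that no message is ever emitted for an invalid or physical slot: the collector side then never has to decide what to do with a reset for a slot it doesn't track.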
Re: Replication slot stats misgivings
On Tue, Apr 20, 2021 at 7:54 PM Masahiko Sawada wrote: > > > I've attached the patch. In addition to the test Vignesh prepared, I > added one test for the message for creating a slot that checks if the > statistics are initialized after re-creating the same name slot. > I am not sure how useful your new test is because you are testing it with a slot name for which we have removed the slot file. It is not related to the stats messages this patch is sending. I think we can leave that for now. One other minor comment: - * create the statistics for the replication slot. + * create the statistics for the replication slot. In the cases where the + * message for dropping the old slot gets lost and a slot with the same + * name is created, since the stats will be initialized by the message + * for creating the slot the statistics are not accumulated into the + * old slot unless the messages for both creating and dropping slots with + * the same name got lost. Just in case it happens, the user can reset + * the particular stats by pg_stat_reset_replication_slot(). I think we can change it to something like: "XXX In case the messages for creating and dropping a slot of the same name get lost and the create happens before (auto)vacuum cleans up the dead slot, the stats will be accumulated into the old slot. One can imagine having OIDs for each slot to avoid the accumulation of stats, but that doesn't seem worth doing as in practice this won't happen frequently.". Also, after your recent change I am not sure whether it is a good idea to mention something in the docs. What do you think? -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Wed, Apr 21, 2021 at 9:47 AM Amit Kapila wrote: > > On Wed, Apr 21, 2021 at 9:39 AM vignesh C wrote: > > > > I feel we can change CATALOG_VERSION_NO so that we will get this error > > "The database cluster was initialized with CATALOG_VERSION_NO > > 2021X, but the server was compiled with CATALOG_VERSION_NO > > 2021X." which will prevent the above issue. > > > > Right, but we normally do that just before commit. We might want to > mention it in the commit message just as a Note so that we don't > forget to bump it before commit but otherwise, we don't need to change > it in the patch. > Yes, that is fine with me. Regards, Vignesh
Re: Replication slot stats misgivings
On Wed, Apr 21, 2021 at 9:39 AM vignesh C wrote: > > I feel we can change CATALOG_VERSION_NO so that we will get this error > "The database cluster was initialized with CATALOG_VERSION_NO > 2021X, but the server was compiled with CATALOG_VERSION_NO > 2021X." which will prevent the above issue. > Right, but we normally do that just before commit. We might want to mention it in the commit message just as a Note so that we don't forget to bump it before commit but otherwise, we don't need to change it in the patch. -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Tue, Apr 20, 2021 at 7:54 PM Masahiko Sawada wrote: > > On Tue, Apr 20, 2021 at 7:22 PM vignesh C wrote: > > > > On Tue, Apr 20, 2021 at 9:08 AM Masahiko Sawada > > wrote: > > > > > > On Mon, Apr 19, 2021 at 4:48 PM Masahiko Sawada > > > wrote: > > > > > > > > On Mon, Apr 19, 2021 at 2:14 PM Amit Kapila > > > > wrote: > > > > > > > > > > On Mon, Apr 19, 2021 at 9:00 AM Masahiko Sawada > > > > > wrote: > > > > > > > > > > > > On Fri, Apr 16, 2021 at 2:58 PM Amit Kapila > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > 4. > > > > > > > +CREATE VIEW pg_stat_replication_slots AS > > > > > > > +SELECT > > > > > > > +s.slot_name, > > > > > > > +s.spill_txns, > > > > > > > +s.spill_count, > > > > > > > +s.spill_bytes, > > > > > > > +s.stream_txns, > > > > > > > +s.stream_count, > > > > > > > +s.stream_bytes, > > > > > > > +s.total_txns, > > > > > > > +s.total_bytes, > > > > > > > +s.stats_reset > > > > > > > +FROM pg_replication_slots as r, > > > > > > > +LATERAL pg_stat_get_replication_slot(slot_name) as s > > > > > > > +WHERE r.datoid IS NOT NULL; -- excluding physical slots > > > > > > > .. > > > > > > > .. > > > > > > > > > > > > > > -/* Get the statistics for the replication slots */ > > > > > > > +/* Get the statistics for the replication slot */ > > > > > > > Datum > > > > > > > -pg_stat_get_replication_slots(PG_FUNCTION_ARGS) > > > > > > > +pg_stat_get_replication_slot(PG_FUNCTION_ARGS) > > > > > > > { > > > > > > > #define PG_STAT_GET_REPLICATION_SLOT_COLS 10 > > > > > > > - ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo; > > > > > > > + text *slotname_text = PG_GETARG_TEXT_P(0); > > > > > > > + NameData slotname; > > > > > > > > > > > > > > I think with the above changes getting all the slot stats has > > > > > > > become > > > > > > > much costlier. Is there any reason why can't we get all the stats > > > > > > > from > > > > > > > the new hash_table in one shot and return them to the user? 
> > > > > > > > > > > > I think the advantage of this approach would be that it can avoid > > > > > > showing the stats for already-dropped slots. Like other statistics > > > > > > views such as pg_stat_all_tables and pg_stat_all_functions, > > > > > > searching > > > > > > the stats by the name got from pg_replication_slots can show only > > > > > > available slot stats even if the hash table has garbage slot stats. > > > > > > > > > > > > > > > > Sounds reasonable. However, if the create_slot message is missed, it > > > > > will show an empty row for it. See below: > > > > > > > > > > postgres=# select slot_name, total_txns from > > > > > pg_stat_replication_slots; > > > > > slot_name | total_txns > > > > > ---+ > > > > > s1| 0 > > > > > s2| 0 > > > > >| > > > > > (3 rows) > > > > > > > > > > Here, I have manually via debugger skipped sending the create_slot > > > > > message for the third slot and we are showing an empty for it. This > > > > > won't happen for pg_stat_all_tables, as it will set 0 or other initial > > > > > values in such a case. I think we need to address this case. > > > > > > > > Good catch. I think it's better to set 0 to all counters and NULL to > > > > reset_stats. > > > > > > > > > > > > > > > Given that pg_stat_replication_slots doesn’t show garbage slot stats > > > > > > even if it has, I thought we can avoid checking those garbage stats > > > > > > frequently. It should not essentially be a problem for the hash > > > > > > table > > > > > > to have entries up to max_replication_slots regardless of live or > > > > > > already-dropped. > > > > > > > > > > > > > > > > Yeah, but I guess we still might not save much by not doing it, > > > > > especially because for the other cases like tables/functions, we are > > > > > doing it without any threshold limit. > > > > > > > > Agreed. > > > > > > > > I've attached the updated patch, please review it. 
> > > > > > I've attached the new version patch that fixed the compilation error > > > reported off-line by Amit. > > > > Thanks for the updated patch, few comments: > > Thank you for the review comments. > > > 1) We can change "slotent = pgstat_get_replslot_entry(slotname, > > false);" to "return pgstat_get_replslot_entry(slotname, false);" and > > remove the slotent variable. > > > > + PgStat_StatReplSlotEntry *slotent = NULL; > > + > > backend_read_statsfile(); > > > > - *nslots_p = nReplSlotStats; > > - return replSlotStats; > > + slotent = pgstat_get_replslot_entry(slotname, false); > > + > > + return slotent; > > Fixed. > > > > > 2) Should we change PGSTAT_FILE_FORMAT_ID as the statistic file format > > has changed for replication statistics? > > The struct name is changed but I think the statistics file format has > not changed by this patch. No? I
Re: Replication slot stats misgivings
On Tue, Apr 20, 2021 at 7:54 PM Masahiko Sawada wrote: > I have one question: + /* + * Create the replication slot stats hash table if we don't have + * it already. + */ + if (replSlotStats == NULL) { - if (namestrcmp(&replSlotStats[i].slotname, name) == 0) - return i; /* found */ + HASHCTL hash_ctl; + + hash_ctl.keysize = sizeof(NameData); + hash_ctl.entrysize = sizeof(PgStat_StatReplSlotEntry); + hash_ctl.hcxt = pgStatLocalContext; + + replSlotStats = hash_create("Replication slots hash", + PGSTAT_REPLSLOT_HASH_SIZE, + &hash_ctl, + HASH_ELEM | HASH_BLOBS | HASH_CONTEXT); } It seems to me that the patch is always creating a hash table in pgStatLocalContext? AFAIU, we need to create it in pgStatLocalContext when we read stats via backend_read_statsfile so that we can clear it at the end of the transaction. The db/function stats seem to be doing the same. Is there a reason why here we need to always create it in pgStatLocalContext? -- With Regards, Amit Kapila.
Re: Replication slot stats misgivings
On Tue, Apr 20, 2021 at 7:22 PM vignesh C wrote: > > On Tue, Apr 20, 2021 at 9:08 AM Masahiko Sawada wrote: > > > > On Mon, Apr 19, 2021 at 4:48 PM Masahiko Sawada > > wrote: > > > > > > On Mon, Apr 19, 2021 at 2:14 PM Amit Kapila > > > wrote: > > > > > > > > On Mon, Apr 19, 2021 at 9:00 AM Masahiko Sawada > > > > wrote: > > > > > > > > > > On Fri, Apr 16, 2021 at 2:58 PM Amit Kapila > > > > > wrote: > > > > > > > > > > > > > > > > > > 4. > > > > > > +CREATE VIEW pg_stat_replication_slots AS > > > > > > +SELECT > > > > > > +s.slot_name, > > > > > > +s.spill_txns, > > > > > > +s.spill_count, > > > > > > +s.spill_bytes, > > > > > > +s.stream_txns, > > > > > > +s.stream_count, > > > > > > +s.stream_bytes, > > > > > > +s.total_txns, > > > > > > +s.total_bytes, > > > > > > +s.stats_reset > > > > > > +FROM pg_replication_slots as r, > > > > > > +LATERAL pg_stat_get_replication_slot(slot_name) as s > > > > > > +WHERE r.datoid IS NOT NULL; -- excluding physical slots > > > > > > .. > > > > > > .. > > > > > > > > > > > > -/* Get the statistics for the replication slots */ > > > > > > +/* Get the statistics for the replication slot */ > > > > > > Datum > > > > > > -pg_stat_get_replication_slots(PG_FUNCTION_ARGS) > > > > > > +pg_stat_get_replication_slot(PG_FUNCTION_ARGS) > > > > > > { > > > > > > #define PG_STAT_GET_REPLICATION_SLOT_COLS 10 > > > > > > - ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo; > > > > > > + text *slotname_text = PG_GETARG_TEXT_P(0); > > > > > > + NameData slotname; > > > > > > > > > > > > I think with the above changes getting all the slot stats has become > > > > > > much costlier. Is there any reason why can't we get all the stats > > > > > > from > > > > > > the new hash_table in one shot and return them to the user? > > > > > > > > > > I think the advantage of this approach would be that it can avoid > > > > > showing the stats for already-dropped slots. 
Like other statistics > > > > > views such as pg_stat_all_tables and pg_stat_all_functions, searching > > > > > the stats by the name got from pg_replication_slots can show only > > > > > available slot stats even if the hash table has garbage slot stats. > > > > > > > > > > > > > Sounds reasonable. However, if the create_slot message is missed, it > > > > will show an empty row for it. See below: > > > > > > > > postgres=# select slot_name, total_txns from pg_stat_replication_slots; > > > > slot_name | total_txns > > > > ---+ > > > > s1| 0 > > > > s2| 0 > > > >| > > > > (3 rows) > > > > > > > > Here, I have manually via debugger skipped sending the create_slot > > > > message for the third slot and we are showing an empty for it. This > > > > won't happen for pg_stat_all_tables, as it will set 0 or other initial > > > > values in such a case. I think we need to address this case. > > > > > > Good catch. I think it's better to set 0 to all counters and NULL to > > > reset_stats. > > > > > > > > > > > > Given that pg_stat_replication_slots doesn’t show garbage slot stats > > > > > even if it has, I thought we can avoid checking those garbage stats > > > > > frequently. It should not essentially be a problem for the hash table > > > > > to have entries up to max_replication_slots regardless of live or > > > > > already-dropped. > > > > > > > > > > > > > Yeah, but I guess we still might not save much by not doing it, > > > > especially because for the other cases like tables/functions, we are > > > > doing it without any threshold limit. > > > > > > Agreed. > > > > > > I've attached the updated patch, please review it. > > > > I've attached the new version patch that fixed the compilation error > > reported off-line by Amit. > > Thanks for the updated patch, few comments: Thank you for the review comments. 
> 1) We can change "slotent = pgstat_get_replslot_entry(slotname, > false);" to "return pgstat_get_replslot_entry(slotname, false);" and > remove the slotent variable. > > + PgStat_StatReplSlotEntry *slotent = NULL; > + > backend_read_statsfile(); > > - *nslots_p = nReplSlotStats; > - return replSlotStats; > + slotent = pgstat_get_replslot_entry(slotname, false); > + > + return slotent; Fixed. > > 2) Should we change PGSTAT_FILE_FORMAT_ID as the statistic file format > has changed for replication statistics? The struct name is changed but I think the statistics file format has not changed by this patch. No? > > 3) We can include PgStat_StatReplSlotEntry in typedefs.lst and remove > PgStat_ReplSlotStats from typedefs.lst Fixed. > > 4) A few indentation issues are there; we can run pgindent on the pgstat.c changes: > case 'R': > - if > (fread(&replSlotStats[nReplSlotStats], 1, > sizeof(PgStat_ReplSlotStats), fpin) > -
Re: Replication slot stats misgivings
On Tue, Apr 20, 2021 at 6:59 PM Amit Kapila wrote: > > On Tue, Apr 20, 2021 at 9:08 AM Masahiko Sawada wrote: > > > > I've attached the new version patch that fixed the compilation error > > reported off-line by Amit. > > > > I was thinking about whether we can somehow avoid the below risk: > In case where the > + * message for dropping the old slot gets lost and a slot with the same > + * name is created, the stats will be accumulated into the old slots since > + * we use the slot name as the key. In that case, user can reset the > + * particular stats by pg_stat_reset_replication_slot(). > > What if we send a separate message for create slot such that the stats > collector will initialize the entries even if the previous drop > message is lost or came later? If we do that then if the drop message > is lost, the create with the same name won't accumulate the stats and if > the drop came later, it will remove the newly created stats but > anyway, later stats from the same slot will again create the slot > entry in the hash table. Sounds good to me. There is still a small chance of this happening if the messages for both creating and dropping slots with the same name get lost, but that is unlikely in practice. > > Also, I think we can include the test case prepared by Vignesh in the email > [1]. > > Apart from the above, I have made a few minor modifications in the attached > patch. > (a) + if (slotent->stat_reset_timestamp == 0 || !slotent) > I don't understand why the second part of the check is required? By this time > slotent will anyway have some valid value. > > (b) + slotent = (PgStat_StatReplSlotEntry *) hash_search(replSlotStats, > +(void *) &slotname, > +create_it ? HASH_ENTER : HASH_FIND, > +&found); > > It is better to use NameStr here. > > (c) made various changes in comments and some other cosmetic changes. All the above changes make sense to me. I'll submit the updated patch soon. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
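The message-ordering idea above can be modeled in miniature: if a "create" message always (re)initializes the entry for a name, a lost "drop" no longer causes the old slot's counters to leak into a newly created slot of the same name. Everything below is an illustrative toy, not the actual pgstat message handling; the handler names and the tiny fixed table are invented for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NAME_LEN 64
#define NSLOTS 4

typedef struct
{
    char name[NAME_LEN];
    long total_txns;
    bool used;
} Entry;

/* Collector-side table, keyed by slot name as in the patch. */
static Entry table[NSLOTS];

static Entry *
lookup(const char *name, bool create)
{
    Entry *free_slot = NULL;

    for (int i = 0; i < NSLOTS; i++)
    {
        if (table[i].used && strcmp(table[i].name, name) == 0)
            return &table[i];
        if (!table[i].used && free_slot == NULL)
            free_slot = &table[i];
    }
    if (create && free_slot != NULL)
    {
        memset(free_slot, 0, sizeof(Entry));
        strncpy(free_slot->name, name, NAME_LEN - 1);
        free_slot->used = true;
        return free_slot;
    }
    return NULL;
}

/* "create" resets counters even if a stale entry of the same name exists. */
static void
handle_create(const char *name)
{
    Entry *e = lookup(name, true);

    if (e)
        e->total_txns = 0;
}

static void
handle_update(const char *name, long txns)
{
    Entry *e = lookup(name, true);

    if (e)
        e->total_txns += txns;
}

static void
handle_drop(const char *name)
{
    Entry *e = lookup(name, false);

    if (e)
        e->used = false;
}
```

In the lost-drop scenario from the mail, handle_drop() for the old slot is simply never called, but the subsequent handle_create() for the re-created slot wipes the stale counters anyway, which is the property the separate create message buys.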