Re: [HACKERS] some review comments on logical rep code
On 4/29/17 00:33, Noah Misch wrote: > On Fri, Apr 28, 2017 at 02:13:48PM -0400, Peter Eisentraut wrote: >> On 4/28/17 01:01, Noah Misch wrote: >>> On Fri, Apr 28, 2017 at 01:55:48PM +0900, Masahiko Sawada wrote: On Fri, Apr 28, 2017 at 1:42 PM, Noah Misch wrote: > On Fri, Apr 28, 2017 at 06:37:09AM +0900, Fujii Masao wrote: >> Pushed. Thanks! > > Does this close the open item, or is there more to do? There is only one item remaining, and the patch is attached on here[1]. I guess reviewing that patch is almost done. [1] https://www.postgresql.org/message-id/CAHGQGwELzJrA4SDA4TsJGds4X-ykTOP%2By5hecsoQmQqzZf8T7A%40mail.gmail.com >>> >>> Thanks. Peter, This PostgreSQL 10 open item is past due for your status >>> update. Kindly send a status update within 24 hours, and include a date for >>> your subsequent status update. Refer to the policy on open item ownership: >>> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com >> >> I think the patch that Fujii Masao has proposed has found general >> agreement. I would recommend that he commits it as he sees fit. > > This is not a conforming status update, because it does not specify a date for > your next update. Everything related to this is fixed now. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
I have committed this version. I have omitted all the talk about 2PC. There are discussions ongoing about changing the transaction behavior of CREATE SUBSCRIPTION, which might interfere with that. If someone wants to rebase and propose the parts about 2PC separately, I don't object, but it can be handled separately. I will close this open item now, since I think we have covered everything that was originally reported. Please start new threads if there are any issues that were missed or that have been discovered during the discussion. On 4/23/17 22:18, Masahiko Sawada wrote: > On Sat, Apr 22, 2017 at 4:26 AM, Fujii Masao wrote: >> On Fri, Apr 21, 2017 at 4:02 PM, Masahiko Sawada >> wrote: >>> On Fri, Apr 21, 2017 at 1:19 AM, Fujii Masao wrote: On Tue, Apr 18, 2017 at 5:16 PM, Masahiko Sawada wrote: > On Tue, Apr 18, 2017 at 12:24 PM, Kyotaro HORIGUCHI > wrote: >> Hi, >> >> Thank you for the revised version. >> >> At Mon, 17 Apr 2017 23:29:28 +0900, Masahiko Sawada >> wrote in >> >>> On Mon, Apr 17, 2017 at 9:13 PM, Masahiko Sawada >>> wrote: On Mon, Apr 17, 2017 at 7:39 PM, Kyotaro HORIGUCHI wrote: > At Mon, 17 Apr 2017 18:02:57 +0900, Masahiko Sawada > wrote in > >> On Fri, Apr 14, 2017 at 4:47 AM, Fujii Masao >> wrote: >> 1. >>> >>> In ApplyWorkerMain(), DatumGetInt32() should be used to get integer >>> value from the argument, instead of DatumGetObjectId(). >> >> Attached 001 patch fixes it. > > Hmm. I looked at the launcher side and found that it assigns bare > integer to bgw_main_arg. It also should use Int32GetDatum. Yeah, I agreed. Will fix it. >> >> Thanks. >> >> 2. >>> >>> No one resets on_commit_launcher_wakeup flag to false. >> >> Attached 002 patch makes on_commit_launcher_wakeup turn off after >> woke >> up the launcher process. > > It is omitting the abort case. Maybe it should be > AtEOXact_ApplyLauncher(bool commit)? Should we wake up the launcher process even when abort? >> >> No, I meant that on_commit_launcher_wakeup should be just reset >> without waking up launcher on abort. > > I understood. Sounds reasonable. ROLLBACK PREPARED should not wake up the launcher, but not even with the patch. >>> >>> Yeah, I've missed it. But on_commit_launcher_wakeup dosen't correspond >>> to a prepared transaction. Even COMMIT PREPARED might wake up the >>> launcher process but might not wake up it. ROLLBACK PREPARED is also >>> the same. For example, in the following situation we wake up the >>> launcher at COMMIT, not at COMMIT PREPARED. >>> >>> (BTW, CreateSubscription, is the only one function that sets >>> on_commit_launcher_wakeup for now, cannot be called in a transaction >>> block. So it doesn't actually happen that we wake up the launcher when >>> a ROLLBACK PREPARED. But considering waking up the launcher by ALTER >>> SUBSCRIPTION ENABLE, we need to deal with it.) >> >> We can run CREATE SUBSCRIPTION with NOCREATE SLOT option >> within the transaction block. > > Oops, I'd missed it. > >> >>> >>> BEGIN; >>> ALTER SUBSCRIPTION hoge_sub ENABLE; >>> PREPARE TRANSACTION 'g'; >>> BEGIN; >>> SELECT 1; >>> COMMIT; -- wake up the launcher at this point. >>> COMMIT PREPARED 'g'; >>> >>> Even if we wake up the launcher even when a ROLLBACK PREPARED, it's >>> not a big deal to the launcher process actually. Compared with the >>> complexly of changing the logic so that on_commit_launcher_wakeup >>> corresponds to prepared transaction, we might want to accept this >>> behavior. >> >> on_commit_launcher_wakeup needs to be recoreded in 2PC state >> file to handle this issue properly. But I agree with you, that would >> be overkill for small gain. So I'm thinking to add the following >> comment into your patch and push it. Thought? >> >> - >> Note that on_commit_launcher_wakeup flag doesn't work as expected in 2PC >> case. >> For example, COMMIT PREPARED on the transaction enabling the flag >> doesn't wake up >> the logical replication launcher if ROLLBACK on another transaction >> runs before it. >> To handle this case properly, the flag needs to be recorded in 2PC >> state file so that >> COMMIT PREPARED and ROLLBACK PREPARED can determine whether to wake up >> the launcher. However this is overkill for small gain and false wakeup >> of the launcher >> is not so harmful (probably we can live with that), so we do nothing >> here for this issue. >> - >> > > Agreed. > > I added this comment to the patch and attached it. > > Regards, > > -- > Masahiko Sawada > NIPPON TELEGRAPH AND TELEPHONE CORPORATION > NTT Open Source Software Center > > > > -- Peter Eisentraut http:/
Re: [HACKERS] some review comments on logical rep code
On Sat, Apr 29, 2017 at 12:33 AM, Noah Misch wrote: >> I think the patch that Fujii Masao has proposed has found general >> agreement. I would recommend that he commits it as he sees fit. > > This is not a conforming status update, because it does not specify a date for > your next update. If Peter thinks that the issues fixed by this patch don't really need to be corrected, then we could interpret this as a statement that it should not be listed as an open item but that he does not object to someone else fixing it anyway. If these are real issues, then Peter should take responsibility for them because they were created by his commit. If Fujii Masao is willing to help by committing the patch in question, great; but if not, then Peter should be responsible for committing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On Fri, Apr 28, 2017 at 02:13:48PM -0400, Peter Eisentraut wrote: > On 4/28/17 01:01, Noah Misch wrote: > > On Fri, Apr 28, 2017 at 01:55:48PM +0900, Masahiko Sawada wrote: > >> On Fri, Apr 28, 2017 at 1:42 PM, Noah Misch wrote: > >>> On Fri, Apr 28, 2017 at 06:37:09AM +0900, Fujii Masao wrote: > Pushed. Thanks! > >>> > >>> Does this close the open item, or is there more to do? > >> > >> There is only one item remaining, and the patch is attached on > >> here[1]. I guess reviewing that patch is almost done. > >> > >> [1] > >> https://www.postgresql.org/message-id/CAHGQGwELzJrA4SDA4TsJGds4X-ykTOP%2By5hecsoQmQqzZf8T7A%40mail.gmail.com > > > > Thanks. Peter, This PostgreSQL 10 open item is past due for your status > > update. Kindly send a status update within 24 hours, and include a date for > > your subsequent status update. Refer to the policy on open item ownership: > > https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com > > I think the patch that Fujii Masao has proposed has found general > agreement. I would recommend that he commits it as he sees fit. This is not a conforming status update, because it does not specify a date for your next update. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On 4/28/17 01:01, Noah Misch wrote: > On Fri, Apr 28, 2017 at 01:55:48PM +0900, Masahiko Sawada wrote: >> On Fri, Apr 28, 2017 at 1:42 PM, Noah Misch wrote: >>> On Fri, Apr 28, 2017 at 06:37:09AM +0900, Fujii Masao wrote: Pushed. Thanks! >>> >>> Does this close the open item, or is there more to do? >> >> There is only one item remaining, and the patch is attached on >> here[1]. I guess reviewing that patch is almost done. >> >> [1] >> https://www.postgresql.org/message-id/CAHGQGwELzJrA4SDA4TsJGds4X-ykTOP%2By5hecsoQmQqzZf8T7A%40mail.gmail.com > > Thanks. Peter, This PostgreSQL 10 open item is past due for your status > update. Kindly send a status update within 24 hours, and include a date for > your subsequent status update. Refer to the policy on open item ownership: > https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com I think the patch that Fujii Masao has proposed has found general agreement. I would recommend that he commits it as he sees fit. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On Fri, Apr 28, 2017 at 01:55:48PM +0900, Masahiko Sawada wrote: > On Fri, Apr 28, 2017 at 1:42 PM, Noah Misch wrote: > > On Fri, Apr 28, 2017 at 06:37:09AM +0900, Fujii Masao wrote: > >> Pushed. Thanks! > > > > Does this close the open item, or is there more to do? > > There is only one item remaining, and the patch is attached on > here[1]. I guess reviewing that patch is almost done. > > [1] > https://www.postgresql.org/message-id/CAHGQGwELzJrA4SDA4TsJGds4X-ykTOP%2By5hecsoQmQqzZf8T7A%40mail.gmail.com Thanks. Peter, This PostgreSQL 10 open item is past due for your status update. Kindly send a status update within 24 hours, and include a date for your subsequent status update. Refer to the policy on open item ownership: https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On Fri, Apr 28, 2017 at 1:42 PM, Noah Misch wrote: > On Fri, Apr 28, 2017 at 06:37:09AM +0900, Fujii Masao wrote: >> Pushed. Thanks! > > Does this close the open item, or is there more to do? There is only one item remaining, and the patch is attached on here[1]. I guess reviewing that patch is almost done. [1] https://www.postgresql.org/message-id/CAHGQGwELzJrA4SDA4TsJGds4X-ykTOP%2By5hecsoQmQqzZf8T7A%40mail.gmail.com Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On Fri, Apr 28, 2017 at 06:37:09AM +0900, Fujii Masao wrote: > Pushed. Thanks! Does this close the open item, or is there more to do? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On Thu, Apr 27, 2017 at 5:37 PM, Petr Jelinek wrote: > On 26/04/17 18:36, Fujii Masao wrote: >> On Thu, Apr 27, 2017 at 1:28 AM, Fujii Masao wrote: >>> On Wed, Apr 26, 2017 at 3:47 PM, Kyotaro HORIGUCHI >>> wrote: At Wed, 26 Apr 2017 14:31:12 +0900, Masahiko Sawada wrote in > On Wed, Apr 26, 2017 at 12:35 PM, Petr Jelinek > wrote: >> On 26/04/17 01:01, Fujii Masao wrote: > However this is overkill for small gain and false wakeup of the > launcher is not so harmful (probably we can live with that), so > we do nothing here for this issue. I agree this as a whole. But I think that the main issue here is not false wakeups, but 'possible delay of launching new workers by 3 minutes at most' (this is centainly a kind of false wakeups, though). We can live with this failure when using two-paase commmit, but I think it shouldn't happen silently. How about providing AtPrepare_ApplyLauncher(void) like the follows and calling it in PrepareTransaction? >>> >>> Or we should apply the attached patch and handle the 2PC case properly? >>> I was thinking that it's overkill more than necessary, but that seems >>> not true >>> as far as I implement that. >>> >> Looks like it does not even increase size of the 2pc file, +1 for this. > > In my honest opinion, I didn't have a big will that we should handle > even two-phase commit case, because this case is very rare (I could > not image such case) and doesn't mean to lead a harmful result such as > crash of server and returning inconsistent result. it just delays the > launching worker for at most 3 minutes. We also can deal with this for > example by making maximum nap time of apply launcher user-configurable > and document it. > But if we can deal with it by minimum changes like attached your patch I > agree. This change looks reasonable to me, +1 from me to this. The patch reads on_commit_launcher_wakeup directly then updates it via ApplyALuncherWakeupAtCommit() but it's too much to add a function for the sake of this. >>> >>> OK, so what about the attached patch? I replaced all the calls to >>> ApplyLauncherWakeupAtCommit() with the code "on_commit_launcher_wakeup = >>> true". >> >> BTW, while I was reading the code to implement the patch that >> I posted upthread, I found that the following condition doesn't >> work as expected. This is because "last_start_time" is always 0. >> >> /* Limit the start retry to once a wal_retrieve_retry_interval */ >> if (TimestampDifferenceExceeds(last_start_time, now, >> wal_retrieve_retry_interval)) >> >> "last_start_time" is set to "now" when the launcher starts up new >> worker. But "last_start_time" is defined and always initialized with 0 >> at the beginning of the main loop in ApplyLauncherMain(), so >> the above condition becomes always true. This is obviously a bug. >> Attached patch fixes this issue. >> > > Yes, please go ahead and commit this. Pushed. Thanks! Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On 26/04/17 18:36, Fujii Masao wrote: > On Thu, Apr 27, 2017 at 1:28 AM, Fujii Masao wrote: >> On Wed, Apr 26, 2017 at 3:47 PM, Kyotaro HORIGUCHI >> wrote: >>> At Wed, 26 Apr 2017 14:31:12 +0900, Masahiko Sawada >>> wrote in >>> On Wed, Apr 26, 2017 at 12:35 PM, Petr Jelinek wrote: > On 26/04/17 01:01, Fujii Masao wrote: However this is overkill for small gain and false wakeup of the launcher is not so harmful (probably we can live with that), so we do nothing here for this issue. >>> >>> I agree this as a whole. But I think that the main issue here is >>> not false wakeups, but 'possible delay of launching new workers >>> by 3 minutes at most' (this is centainly a kind of false wakeups, >>> though). We can live with this failure when using two-paase >>> commmit, but I think it shouldn't happen silently. >>> >>> >>> How about providing AtPrepare_ApplyLauncher(void) like the >>> follows and calling it in PrepareTransaction? >> >> Or we should apply the attached patch and handle the 2PC case properly? >> I was thinking that it's overkill more than necessary, but that seems >> not true >> as far as I implement that. >> > Looks like it does not even increase size of the 2pc file, +1 for this. In my honest opinion, I didn't have a big will that we should handle even two-phase commit case, because this case is very rare (I could not image such case) and doesn't mean to lead a harmful result such as crash of server and returning inconsistent result. it just delays the launching worker for at most 3 minutes. We also can deal with this for example by making maximum nap time of apply launcher user-configurable and document it. But if we can deal with it by minimum changes like attached your patch I agree. >>> >>> This change looks reasonable to me, +1 from me to this. >>> >>> The patch reads on_commit_launcher_wakeup directly then updates >>> it via ApplyALuncherWakeupAtCommit() but it's too much to add a >>> function for the sake of this. >> >> OK, so what about the attached patch? I replaced all the calls to >> ApplyLauncherWakeupAtCommit() with the code "on_commit_launcher_wakeup = >> true". > > BTW, while I was reading the code to implement the patch that > I posted upthread, I found that the following condition doesn't > work as expected. This is because "last_start_time" is always 0. > > /* Limit the start retry to once a wal_retrieve_retry_interval */ > if (TimestampDifferenceExceeds(last_start_time, now, > wal_retrieve_retry_interval)) > > "last_start_time" is set to "now" when the launcher starts up new > worker. But "last_start_time" is defined and always initialized with 0 > at the beginning of the main loop in ApplyLauncherMain(), so > the above condition becomes always true. This is obviously a bug. > Attached patch fixes this issue. > Yes, please go ahead and commit this. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On Thu, Apr 27, 2017 at 1:28 AM, Fujii Masao wrote: > On Wed, Apr 26, 2017 at 3:47 PM, Kyotaro HORIGUCHI > wrote: >> At Wed, 26 Apr 2017 14:31:12 +0900, Masahiko Sawada >> wrote in >> >>> On Wed, Apr 26, 2017 at 12:35 PM, Petr Jelinek >>> wrote: >>> > On 26/04/17 01:01, Fujii Masao wrote: >>> However this is overkill for small gain and false wakeup of the >>> launcher is not so harmful (probably we can live with that), so >>> we do nothing here for this issue. >>> >>> >>> >>> I agree this as a whole. But I think that the main issue here is >>> >>> not false wakeups, but 'possible delay of launching new workers >>> >>> by 3 minutes at most' (this is centainly a kind of false wakeups, >>> >>> though). We can live with this failure when using two-paase >>> >>> commmit, but I think it shouldn't happen silently. >>> >>> >>> >>> >>> >>> How about providing AtPrepare_ApplyLauncher(void) like the >>> >>> follows and calling it in PrepareTransaction? >>> >> >>> >> Or we should apply the attached patch and handle the 2PC case properly? >>> >> I was thinking that it's overkill more than necessary, but that seems >>> >> not true >>> >> as far as I implement that. >>> >> >>> > Looks like it does not even increase size of the 2pc file, +1 for this. >>> >>> In my honest opinion, I didn't have a big will that we should handle >>> even two-phase commit case, because this case is very rare (I could >>> not image such case) and doesn't mean to lead a harmful result such as >>> crash of server and returning inconsistent result. it just delays the >>> launching worker for at most 3 minutes. We also can deal with this for >>> example by making maximum nap time of apply launcher user-configurable >>> and document it. >>> But if we can deal with it by minimum changes like attached your patch I >>> agree. >> >> This change looks reasonable to me, +1 from me to this. >> >> The patch reads on_commit_launcher_wakeup directly then updates >> it via ApplyALuncherWakeupAtCommit() but it's too much to add a >> function for the sake of this. > > OK, so what about the attached patch? I replaced all the calls to > ApplyLauncherWakeupAtCommit() with the code "on_commit_launcher_wakeup = > true". BTW, while I was reading the code to implement the patch that I posted upthread, I found that the following condition doesn't work as expected. This is because "last_start_time" is always 0. /* Limit the start retry to once a wal_retrieve_retry_interval */ if (TimestampDifferenceExceeds(last_start_time, now, wal_retrieve_retry_interval)) "last_start_time" is set to "now" when the launcher starts up new worker. But "last_start_time" is defined and always initialized with 0 at the beginning of the main loop in ApplyLauncherMain(), so the above condition becomes always true. This is obviously a bug. Attached patch fixes this issue. Regards, -- Fujii Masao bugfix.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On Wed, Apr 26, 2017 at 3:47 PM, Kyotaro HORIGUCHI wrote: > At Wed, 26 Apr 2017 14:31:12 +0900, Masahiko Sawada > wrote in > >> On Wed, Apr 26, 2017 at 12:35 PM, Petr Jelinek >> wrote: >> > On 26/04/17 01:01, Fujii Masao wrote: >> However this is overkill for small gain and false wakeup of the >> launcher is not so harmful (probably we can live with that), so >> we do nothing here for this issue. >> >>> >> >>> I agree this as a whole. But I think that the main issue here is >> >>> not false wakeups, but 'possible delay of launching new workers >> >>> by 3 minutes at most' (this is centainly a kind of false wakeups, >> >>> though). We can live with this failure when using two-paase >> >>> commmit, but I think it shouldn't happen silently. >> >>> >> >>> >> >>> How about providing AtPrepare_ApplyLauncher(void) like the >> >>> follows and calling it in PrepareTransaction? >> >> >> >> Or we should apply the attached patch and handle the 2PC case properly? >> >> I was thinking that it's overkill more than necessary, but that seems not >> >> true >> >> as far as I implement that. >> >> >> > Looks like it does not even increase size of the 2pc file, +1 for this. >> >> In my honest opinion, I didn't have a big will that we should handle >> even two-phase commit case, because this case is very rare (I could >> not image such case) and doesn't mean to lead a harmful result such as >> crash of server and returning inconsistent result. it just delays the >> launching worker for at most 3 minutes. We also can deal with this for >> example by making maximum nap time of apply launcher user-configurable >> and document it. >> But if we can deal with it by minimum changes like attached your patch I >> agree. > > This change looks reasonable to me, +1 from me to this. > > The patch reads on_commit_launcher_wakeup directly then updates > it via ApplyALuncherWakeupAtCommit() but it's too much to add a > function for the sake of this. OK, so what about the attached patch? I replaced all the calls to ApplyLauncherWakeupAtCommit() with the code "on_commit_launcher_wakeup = true". Regards, -- Fujii Masao 002_Reset_on_commit_launcher_wakeup_v6.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
At Wed, 26 Apr 2017 14:31:12 +0900, Masahiko Sawada wrote in > On Wed, Apr 26, 2017 at 12:35 PM, Petr Jelinek > wrote: > > On 26/04/17 01:01, Fujii Masao wrote: > However this is overkill for small gain and false wakeup of the > launcher is not so harmful (probably we can live with that), so > we do nothing here for this issue. > >>> > >>> I agree this as a whole. But I think that the main issue here is > >>> not false wakeups, but 'possible delay of launching new workers > >>> by 3 minutes at most' (this is centainly a kind of false wakeups, > >>> though). We can live with this failure when using two-paase > >>> commmit, but I think it shouldn't happen silently. > >>> > >>> > >>> How about providing AtPrepare_ApplyLauncher(void) like the > >>> follows and calling it in PrepareTransaction? > >> > >> Or we should apply the attached patch and handle the 2PC case properly? > >> I was thinking that it's overkill more than necessary, but that seems not > >> true > >> as far as I implement that. > >> > > Looks like it does not even increase size of the 2pc file, +1 for this. > > In my honest opinion, I didn't have a big will that we should handle > even two-phase commit case, because this case is very rare (I could > not image such case) and doesn't mean to lead a harmful result such as > crash of server and returning inconsistent result. it just delays the > launching worker for at most 3 minutes. We also can deal with this for > example by making maximum nap time of apply launcher user-configurable > and document it. > But if we can deal with it by minimum changes like attached your patch I > agree. This change looks reasonable to me, +1 from me to this. The patch reads on_commit_launcher_wakeup directly then updates it via ApplyALuncherWakeupAtCommit() but it's too much to add a function for the sake of this. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On Wed, Apr 26, 2017 at 12:35 PM, Petr Jelinek wrote: > On 26/04/17 01:01, Fujii Masao wrote: >> On Mon, Apr 24, 2017 at 7:57 PM, Kyotaro HORIGUCHI >> wrote: >>> Hello, >>> >>> At Mon, 24 Apr 2017 11:18:32 +0900, Masahiko Sawada >>> wrote in >>> >> BEGIN; >> ALTER SUBSCRIPTION hoge_sub ENABLE; >> PREPARE TRANSACTION 'g'; >> BEGIN; >> SELECT 1; >> COMMIT; -- wake up the launcher at this point. >> COMMIT PREPARED 'g'; >> >> Even if we wake up the launcher even when a ROLLBACK PREPARED, it's >> not a big deal to the launcher process actually. Compared with the >> complexly of changing the logic so that on_commit_launcher_wakeup >> corresponds to prepared transaction, we might want to accept this >> behavior. > > on_commit_launcher_wakeup needs to be recoreded in 2PC state > file to handle this issue properly. But I agree with you, that would > be overkill for small gain. So I'm thinking to add the following > comment into your patch and push it. Thought? > > - > Note that on_commit_launcher_wakeup flag doesn't work as expected in 2PC > case. > For example, COMMIT PREPARED on the transaction enabling the flag > doesn't wake up > the logical replication launcher if ROLLBACK on another transaction > runs before it. > To handle this case properly, the flag needs to be recorded in 2PC > state file so that > COMMIT PREPARED and ROLLBACK PREPARED can determine whether to wake up > the launcher. However this is overkill for small gain and false wakeup > of the launcher > is not so harmful (probably we can live with that), so we do nothing > here for this issue. > - > Agreed. I added this comment to the patch and attached it. >>> >>> The following "However" may need a follwoing comma. >>> However this is overkill for small gain and false wakeup of the launcher is not so harmful (probably we can live with that), so we do nothing here for this issue. >>> >>> I agree this as a whole. But I think that the main issue here is >>> not false wakeups, but 'possible delay of launching new workers >>> by 3 minutes at most' (this is centainly a kind of false wakeups, >>> though). We can live with this failure when using two-paase >>> commmit, but I think it shouldn't happen silently. >>> >>> >>> How about providing AtPrepare_ApplyLauncher(void) like the >>> follows and calling it in PrepareTransaction? >> >> Or we should apply the attached patch and handle the 2PC case properly? >> I was thinking that it's overkill more than necessary, but that seems not >> true >> as far as I implement that. >> In my honest opinion, I didn't have a big will that we should handle even two-phase commit case, because this case is very rare (I could not image such case) and doesn't mean to lead a harmful result such as crash of server and returning inconsistent result. it just delays the launching worker for at most 3 minutes. We also can deal with this for example by making maximum nap time of apply launcher user-configurable and document it. But if we can deal with it by minimum changes like attached your patch I agree. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On 26/04/17 01:01, Fujii Masao wrote: > On Mon, Apr 24, 2017 at 7:57 PM, Kyotaro HORIGUCHI > wrote: >> Hello, >> >> At Mon, 24 Apr 2017 11:18:32 +0900, Masahiko Sawada >> wrote in > BEGIN; > ALTER SUBSCRIPTION hoge_sub ENABLE; > PREPARE TRANSACTION 'g'; > BEGIN; > SELECT 1; > COMMIT; -- wake up the launcher at this point. > COMMIT PREPARED 'g'; > > Even if we wake up the launcher even when a ROLLBACK PREPARED, it's > not a big deal to the launcher process actually. Compared with the > complexly of changing the logic so that on_commit_launcher_wakeup > corresponds to prepared transaction, we might want to accept this > behavior. on_commit_launcher_wakeup needs to be recoreded in 2PC state file to handle this issue properly. But I agree with you, that would be overkill for small gain. So I'm thinking to add the following comment into your patch and push it. Thought? - Note that on_commit_launcher_wakeup flag doesn't work as expected in 2PC case. For example, COMMIT PREPARED on the transaction enabling the flag doesn't wake up the logical replication launcher if ROLLBACK on another transaction runs before it. To handle this case properly, the flag needs to be recorded in 2PC state file so that COMMIT PREPARED and ROLLBACK PREPARED can determine whether to wake up the launcher. However this is overkill for small gain and false wakeup of the launcher is not so harmful (probably we can live with that), so we do nothing here for this issue. - >>> >>> Agreed. >>> >>> I added this comment to the patch and attached it. >> >> The following "However" may need a follwoing comma. >> >>> However this is overkill for small gain and false wakeup of the >>> launcher is not so harmful (probably we can live with that), so >>> we do nothing here for this issue. >> >> I agree this as a whole. But I think that the main issue here is >> not false wakeups, but 'possible delay of launching new workers >> by 3 minutes at most' (this is centainly a kind of false wakeups, >> though). We can live with this failure when using two-paase >> commmit, but I think it shouldn't happen silently. >> >> >> How about providing AtPrepare_ApplyLauncher(void) like the >> follows and calling it in PrepareTransaction? > > Or we should apply the attached patch and handle the 2PC case properly? > I was thinking that it's overkill more than necessary, but that seems not true > as far as I implement that. > Looks like it does not even increase size of the 2pc file, +1 for this. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On Mon, Apr 24, 2017 at 7:57 PM, Kyotaro HORIGUCHI wrote: > Hello, > > At Mon, 24 Apr 2017 11:18:32 +0900, Masahiko Sawada > wrote in >> >> BEGIN; >> >> ALTER SUBSCRIPTION hoge_sub ENABLE; >> >> PREPARE TRANSACTION 'g'; >> >> BEGIN; >> >> SELECT 1; >> >> COMMIT; -- wake up the launcher at this point. >> >> COMMIT PREPARED 'g'; >> >> >> >> Even if we wake up the launcher even when a ROLLBACK PREPARED, it's >> >> not a big deal to the launcher process actually. Compared with the >> >> complexly of changing the logic so that on_commit_launcher_wakeup >> >> corresponds to prepared transaction, we might want to accept this >> >> behavior. >> > >> > on_commit_launcher_wakeup needs to be recoreded in 2PC state >> > file to handle this issue properly. But I agree with you, that would >> > be overkill for small gain. So I'm thinking to add the following >> > comment into your patch and push it. Thought? >> > >> > - >> > Note that on_commit_launcher_wakeup flag doesn't work as expected in 2PC >> > case. >> > For example, COMMIT PREPARED on the transaction enabling the flag >> > doesn't wake up >> > the logical replication launcher if ROLLBACK on another transaction >> > runs before it. >> > To handle this case properly, the flag needs to be recorded in 2PC >> > state file so that >> > COMMIT PREPARED and ROLLBACK PREPARED can determine whether to wake up >> > the launcher. However this is overkill for small gain and false wakeup >> > of the launcher >> > is not so harmful (probably we can live with that), so we do nothing >> > here for this issue. >> > - >> > >> >> Agreed. >> >> I added this comment to the patch and attached it. > > The following "However" may need a follwoing comma. > >> However this is overkill for small gain and false wakeup of the >> launcher is not so harmful (probably we can live with that), so >> we do nothing here for this issue. > > I agree this as a whole. But I think that the main issue here is > not false wakeups, but 'possible delay of launching new workers > by 3 minutes at most' (this is centainly a kind of false wakeups, > though). We can live with this failure when using two-paase > commmit, but I think it shouldn't happen silently. > > > How about providing AtPrepare_ApplyLauncher(void) like the > follows and calling it in PrepareTransaction? Or we should apply the attached patch and handle the 2PC case properly? I was thinking that it's overkill more than necessary, but that seems not true as far as I implement that. Regards, -- Fujii Masao 002_Reset_on_commit_launcher_wakeup_v5.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
At Tue, 25 Apr 2017 10:11:06 +0900, Masahiko Sawada wrote in > On Mon, Apr 24, 2017 at 7:57 PM, Kyotaro HORIGUCHI > wrote: > > Hello, > > > > At Mon, 24 Apr 2017 11:18:32 +0900, Masahiko Sawada > > wrote in > > > >> >> BEGIN; > >> >> ALTER SUBSCRIPTION hoge_sub ENABLE; > >> >> PREPARE TRANSACTION 'g'; > >> >> BEGIN; > >> >> SELECT 1; > >> >> COMMIT; -- wake up the launcher at this point. > >> >> COMMIT PREPARED 'g'; > >> >> > >> >> Even if we wake up the launcher even when a ROLLBACK PREPARED, it's > >> >> not a big deal to the launcher process actually. Compared with the > >> >> complexly of changing the logic so that on_commit_launcher_wakeup > >> >> corresponds to prepared transaction, we might want to accept this > >> >> behavior. > >> > > >> > on_commit_launcher_wakeup needs to be recoreded in 2PC state > >> > file to handle this issue properly. But I agree with you, that would > >> > be overkill for small gain. So I'm thinking to add the following > >> > comment into your patch and push it. Thought? > >> > > >> > - > >> > Note that on_commit_launcher_wakeup flag doesn't work as expected in 2PC > >> > case. > >> > For example, COMMIT PREPARED on the transaction enabling the flag > >> > doesn't wake up > >> > the logical replication launcher if ROLLBACK on another transaction > >> > runs before it. > >> > To handle this case properly, the flag needs to be recorded in 2PC > >> > state file so that > >> > COMMIT PREPARED and ROLLBACK PREPARED can determine whether to wake up > >> > the launcher. However this is overkill for small gain and false wakeup > >> > of the launcher > >> > is not so harmful (probably we can live with that), so we do nothing > >> > here for this issue. > >> > - > >> > > >> > >> Agreed. > >> > >> I added this comment to the patch and attached it. > > > > The following "However" may need a follwoing comma. > > > >> However this is overkill for small gain and false wakeup of the > >> launcher is not so harmful (probably we can live with that), so > >> we do nothing here for this issue. > > > > I agree this as a whole. But I think that the main issue here is > > not false wakeups, but 'possible delay of launching new workers > > by 3 minutes at most' > > In what case does launching of the apply worker delay 3 minutes at most? In the "while (!got_SIGTERM)" loop in ApplyLauncherMain, if we tried to run enabled subscriptions but no subscription is enabled, it waits for 3 minutes. If we turn on a subscription but the trigger is consumed by the false wakeup, the loop sees no enabled subscription and will sleep for 3 minutes unless any other trigger comes. # I haven't tried that, though. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On Mon, Apr 24, 2017 at 7:57 PM, Kyotaro HORIGUCHI wrote: > Hello, > > At Mon, 24 Apr 2017 11:18:32 +0900, Masahiko Sawada > wrote in >> >> BEGIN; >> >> ALTER SUBSCRIPTION hoge_sub ENABLE; >> >> PREPARE TRANSACTION 'g'; >> >> BEGIN; >> >> SELECT 1; >> >> COMMIT; -- wake up the launcher at this point. >> >> COMMIT PREPARED 'g'; >> >> >> >> Even if we wake up the launcher even when a ROLLBACK PREPARED, it's >> >> not a big deal to the launcher process actually. Compared with the >> >> complexly of changing the logic so that on_commit_launcher_wakeup >> >> corresponds to prepared transaction, we might want to accept this >> >> behavior. >> > >> > on_commit_launcher_wakeup needs to be recoreded in 2PC state >> > file to handle this issue properly. But I agree with you, that would >> > be overkill for small gain. So I'm thinking to add the following >> > comment into your patch and push it. Thought? >> > >> > - >> > Note that on_commit_launcher_wakeup flag doesn't work as expected in 2PC >> > case. >> > For example, COMMIT PREPARED on the transaction enabling the flag >> > doesn't wake up >> > the logical replication launcher if ROLLBACK on another transaction >> > runs before it. >> > To handle this case properly, the flag needs to be recorded in 2PC >> > state file so that >> > COMMIT PREPARED and ROLLBACK PREPARED can determine whether to wake up >> > the launcher. However this is overkill for small gain and false wakeup >> > of the launcher >> > is not so harmful (probably we can live with that), so we do nothing >> > here for this issue. >> > - >> > >> >> Agreed. >> >> I added this comment to the patch and attached it. > > The following "However" may need a follwoing comma. > >> However this is overkill for small gain and false wakeup of the >> launcher is not so harmful (probably we can live with that), so >> we do nothing here for this issue. > > I agree this as a whole. But I think that the main issue here is > not false wakeups, but 'possible delay of launching new workers > by 3 minutes at most' In what case does launching of the apply worker delay 3 minutes at most? Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
Hello, At Mon, 24 Apr 2017 11:18:32 +0900, Masahiko Sawada wrote in > >> BEGIN; > >> ALTER SUBSCRIPTION hoge_sub ENABLE; > >> PREPARE TRANSACTION 'g'; > >> BEGIN; > >> SELECT 1; > >> COMMIT; -- wake up the launcher at this point. > >> COMMIT PREPARED 'g'; > >> > >> Even if we wake up the launcher even when a ROLLBACK PREPARED, it's > >> not a big deal to the launcher process actually. Compared with the > >> complexly of changing the logic so that on_commit_launcher_wakeup > >> corresponds to prepared transaction, we might want to accept this > >> behavior. > > > > on_commit_launcher_wakeup needs to be recoreded in 2PC state > > file to handle this issue properly. But I agree with you, that would > > be overkill for small gain. So I'm thinking to add the following > > comment into your patch and push it. Thought? > > > > - > > Note that on_commit_launcher_wakeup flag doesn't work as expected in 2PC > > case. > > For example, COMMIT PREPARED on the transaction enabling the flag > > doesn't wake up > > the logical replication launcher if ROLLBACK on another transaction > > runs before it. > > To handle this case properly, the flag needs to be recorded in 2PC > > state file so that > > COMMIT PREPARED and ROLLBACK PREPARED can determine whether to wake up > > the launcher. However this is overkill for small gain and false wakeup > > of the launcher > > is not so harmful (probably we can live with that), so we do nothing > > here for this issue. > > - > > > > Agreed. > > I added this comment to the patch and attached it. The following "However" may need a follwoing comma. > However this is overkill for small gain and false wakeup of the > launcher is not so harmful (probably we can live with that), so > we do nothing here for this issue. I agree this as a whole. But I think that the main issue here is not false wakeups, but 'possible delay of launching new workers by 3 minutes at most' (this is centainly a kind of false wakeups, though). We can live with this failure when using two-paase commmit, but I think it shouldn't happen silently. How about providing AtPrepare_ApplyLauncher(void) like the follows and calling it in PrepareTransaction? void AtPrepare_ApplyLauncher(void) { if (on_commit_launcher_wakeup) ereport(WARNING, (errmsg ("logical replication may not start immediately"), errhint("PREPARE TRANSACTION can hinder immediate lanching of logical replication worker."))); } regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On Sat, Apr 22, 2017 at 4:26 AM, Fujii Masao wrote: > On Fri, Apr 21, 2017 at 4:02 PM, Masahiko Sawada > wrote: >> On Fri, Apr 21, 2017 at 1:19 AM, Fujii Masao wrote: >>> On Tue, Apr 18, 2017 at 5:16 PM, Masahiko Sawada >>> wrote: On Tue, Apr 18, 2017 at 12:24 PM, Kyotaro HORIGUCHI wrote: > Hi, > > Thank you for the revised version. > > At Mon, 17 Apr 2017 23:29:28 +0900, Masahiko Sawada > wrote in > >> On Mon, Apr 17, 2017 at 9:13 PM, Masahiko Sawada >> wrote: >> > On Mon, Apr 17, 2017 at 7:39 PM, Kyotaro HORIGUCHI >> > wrote: >> >> At Mon, 17 Apr 2017 18:02:57 +0900, Masahiko Sawada >> >> wrote in >> >> >> >>> On Fri, Apr 14, 2017 at 4:47 AM, Fujii Masao >> >>> wrote: >> >>> 1. >> >>> > >> >>> > In ApplyWorkerMain(), DatumGetInt32() should be used to get integer >> >>> > value from the argument, instead of DatumGetObjectId(). >> >>> >> >>> Attached 001 patch fixes it. >> >> >> >> Hmm. I looked at the launcher side and found that it assigns bare >> >> integer to bgw_main_arg. It also should use Int32GetDatum. >> > >> > Yeah, I agreed. Will fix it. > > Thanks. > >> >>> 2. >> >>> > >> >>> > No one resets on_commit_launcher_wakeup flag to false. >> >>> >> >>> Attached 002 patch makes on_commit_launcher_wakeup turn off after >> >>> woke >> >>> up the launcher process. >> >> >> >> It is omitting the abort case. Maybe it should be >> >> AtEOXact_ApplyLauncher(bool commit)? >> > >> > Should we wake up the launcher process even when abort? > > No, I meant that on_commit_launcher_wakeup should be just reset > without waking up launcher on abort. I understood. Sounds reasonable. >>> >>> ROLLBACK PREPARED should not wake up the launcher, but not even with the >>> patch. >> >> Yeah, I've missed it. But on_commit_launcher_wakeup dosen't correspond >> to a prepared transaction. Even COMMIT PREPARED might wake up the >> launcher process but might not wake up it. ROLLBACK PREPARED is also >> the same. For example, in the following situation we wake up the >> launcher at COMMIT, not at COMMIT PREPARED. >> >> (BTW, CreateSubscription, is the only one function that sets >> on_commit_launcher_wakeup for now, cannot be called in a transaction >> block. So it doesn't actually happen that we wake up the launcher when >> a ROLLBACK PREPARED. But considering waking up the launcher by ALTER >> SUBSCRIPTION ENABLE, we need to deal with it.) > > We can run CREATE SUBSCRIPTION with NOCREATE SLOT option > within the transaction block. Oops, I'd missed it. > >> >> BEGIN; >> ALTER SUBSCRIPTION hoge_sub ENABLE; >> PREPARE TRANSACTION 'g'; >> BEGIN; >> SELECT 1; >> COMMIT; -- wake up the launcher at this point. >> COMMIT PREPARED 'g'; >> >> Even if we wake up the launcher even when a ROLLBACK PREPARED, it's >> not a big deal to the launcher process actually. Compared with the >> complexly of changing the logic so that on_commit_launcher_wakeup >> corresponds to prepared transaction, we might want to accept this >> behavior. > > on_commit_launcher_wakeup needs to be recoreded in 2PC state > file to handle this issue properly. But I agree with you, that would > be overkill for small gain. So I'm thinking to add the following > comment into your patch and push it. Thought? > > - > Note that on_commit_launcher_wakeup flag doesn't work as expected in 2PC case. > For example, COMMIT PREPARED on the transaction enabling the flag > doesn't wake up > the logical replication launcher if ROLLBACK on another transaction > runs before it. > To handle this case properly, the flag needs to be recorded in 2PC > state file so that > COMMIT PREPARED and ROLLBACK PREPARED can determine whether to wake up > the launcher. However this is overkill for small gain and false wakeup > of the launcher > is not so harmful (probably we can live with that), so we do nothing > here for this issue. > - > Agreed. I added this comment to the patch and attached it. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center 002_Reset_on_commit_launcher_wakeup_v4.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On Fri, Apr 21, 2017 at 4:02 PM, Masahiko Sawada wrote: > On Fri, Apr 21, 2017 at 1:19 AM, Fujii Masao wrote: >> On Tue, Apr 18, 2017 at 5:16 PM, Masahiko Sawada >> wrote: >>> On Tue, Apr 18, 2017 at 12:24 PM, Kyotaro HORIGUCHI >>> wrote: Hi, Thank you for the revised version. At Mon, 17 Apr 2017 23:29:28 +0900, Masahiko Sawada wrote in > On Mon, Apr 17, 2017 at 9:13 PM, Masahiko Sawada > wrote: > > On Mon, Apr 17, 2017 at 7:39 PM, Kyotaro HORIGUCHI > > wrote: > >> At Mon, 17 Apr 2017 18:02:57 +0900, Masahiko Sawada > >> wrote in > >> > >>> On Fri, Apr 14, 2017 at 4:47 AM, Fujii Masao > >>> wrote: > >>> 1. > >>> > > >>> > In ApplyWorkerMain(), DatumGetInt32() should be used to get integer > >>> > value from the argument, instead of DatumGetObjectId(). > >>> > >>> Attached 001 patch fixes it. > >> > >> Hmm. I looked at the launcher side and found that it assigns bare > >> integer to bgw_main_arg. It also should use Int32GetDatum. > > > > Yeah, I agreed. Will fix it. Thanks. > >>> 2. > >>> > > >>> > No one resets on_commit_launcher_wakeup flag to false. > >>> > >>> Attached 002 patch makes on_commit_launcher_wakeup turn off after woke > >>> up the launcher process. > >> > >> It is omitting the abort case. Maybe it should be > >> AtEOXact_ApplyLauncher(bool commit)? > > > > Should we wake up the launcher process even when abort? No, I meant that on_commit_launcher_wakeup should be just reset without waking up launcher on abort. >>> >>> I understood. Sounds reasonable. >> >> ROLLBACK PREPARED should not wake up the launcher, but not even with the >> patch. > > Yeah, I've missed it. But on_commit_launcher_wakeup dosen't correspond > to a prepared transaction. Even COMMIT PREPARED might wake up the > launcher process but might not wake up it. ROLLBACK PREPARED is also > the same. For example, in the following situation we wake up the > launcher at COMMIT, not at COMMIT PREPARED. > > (BTW, CreateSubscription, is the only one function that sets > on_commit_launcher_wakeup for now, cannot be called in a transaction > block. So it doesn't actually happen that we wake up the launcher when > a ROLLBACK PREPARED. But considering waking up the launcher by ALTER > SUBSCRIPTION ENABLE, we need to deal with it.) We can run CREATE SUBSCRIPTION with NOCREATE SLOT option within the transaction block. > > BEGIN; > ALTER SUBSCRIPTION hoge_sub ENABLE; > PREPARE TRANSACTION 'g'; > BEGIN; > SELECT 1; > COMMIT; -- wake up the launcher at this point. > COMMIT PREPARED 'g'; > > Even if we wake up the launcher even when a ROLLBACK PREPARED, it's > not a big deal to the launcher process actually. Compared with the > complexly of changing the logic so that on_commit_launcher_wakeup > corresponds to prepared transaction, we might want to accept this > behavior. on_commit_launcher_wakeup needs to be recoreded in 2PC state file to handle this issue properly. But I agree with you, that would be overkill for small gain. So I'm thinking to add the following comment into your patch and push it. Thought? - Note that on_commit_launcher_wakeup flag doesn't work as expected in 2PC case. For example, COMMIT PREPARED on the transaction enabling the flag doesn't wake up the logical replication launcher if ROLLBACK on another transaction runs before it. To handle this case properly, the flag needs to be recorded in 2PC state file so that COMMIT PREPARED and ROLLBACK PREPARED can determine whether to wake up the launcher. However this is overkill for small gain and false wakeup of the launcher is not so harmful (probably we can live with that), so we do nothing here for this issue. - Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On Fri, Apr 21, 2017 at 11:55 AM, Masahiko Sawada wrote: > On Fri, Apr 21, 2017 at 4:41 AM, Peter Eisentraut > wrote: >> On 4/20/17 12:30, Fujii Masao wrote: >>> I've pushed several patches, and there is now only one remaining patch. >>> I posted the review comment on that patch, and I'm expecting that >>> Masahiko-san will update the patch. So what about waiting for the updated >>> version of the patch by next Monday (April 24th)? >> >> Thank you for pitching in. >> >> Where is your latest patch? >> > > The remaining patch is 002_Reset_on_commit_launcher_wakeup_v3.patch Oops, there is one more remaining issue that reported on this thread and another one[1]. The latest patch is attached on [1]. [1] https://www.postgresql.org/message-id/20170406.212441.177025736.horiguchi.kyotaro%40lab.ntt.co.jp Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On Fri, Apr 21, 2017 at 1:19 AM, Fujii Masao wrote: > On Tue, Apr 18, 2017 at 5:16 PM, Masahiko Sawada > wrote: >> On Tue, Apr 18, 2017 at 12:24 PM, Kyotaro HORIGUCHI >> wrote: >>> Hi, >>> >>> Thank you for the revised version. >>> >>> At Mon, 17 Apr 2017 23:29:28 +0900, Masahiko Sawada >>> wrote in >>> On Mon, Apr 17, 2017 at 9:13 PM, Masahiko Sawada wrote: > On Mon, Apr 17, 2017 at 7:39 PM, Kyotaro HORIGUCHI > wrote: >> At Mon, 17 Apr 2017 18:02:57 +0900, Masahiko Sawada >> wrote in >> >>> On Fri, Apr 14, 2017 at 4:47 AM, Fujii Masao >>> wrote: >>> 1. >>> > >>> > In ApplyWorkerMain(), DatumGetInt32() should be used to get integer >>> > value from the argument, instead of DatumGetObjectId(). >>> >>> Attached 001 patch fixes it. >> >> Hmm. I looked at the launcher side and found that it assigns bare >> integer to bgw_main_arg. It also should use Int32GetDatum. > > Yeah, I agreed. Will fix it. >>> >>> Thanks. >>> >>> 2. >>> > >>> > No one resets on_commit_launcher_wakeup flag to false. >>> >>> Attached 002 patch makes on_commit_launcher_wakeup turn off after woke >>> up the launcher process. >> >> It is omitting the abort case. Maybe it should be >> AtEOXact_ApplyLauncher(bool commit)? > > Should we wake up the launcher process even when abort? >>> >>> No, I meant that on_commit_launcher_wakeup should be just reset >>> without waking up launcher on abort. >> >> I understood. Sounds reasonable. > > ROLLBACK PREPARED should not wake up the launcher, but not even with the > patch. Yeah, I've missed it. But on_commit_launcher_wakeup dosen't correspond to a prepared transaction. Even COMMIT PREPARED might wake up the launcher process but might not wake up it. ROLLBACK PREPARED is also the same. For example, in the following situation we wake up the launcher at COMMIT, not at COMMIT PREPARED. (BTW, CreateSubscription, is the only one function that sets on_commit_launcher_wakeup for now, cannot be called in a transaction block. So it doesn't actually happen that we wake up the launcher when a ROLLBACK PREPARED. But considering waking up the launcher by ALTER SUBSCRIPTION ENABLE, we need to deal with it.) BEGIN; ALTER SUBSCRIPTION hoge_sub ENABLE; PREPARE TRANSACTION 'g'; BEGIN; SELECT 1; COMMIT; -- wake up the launcher at this point. COMMIT PREPARED 'g'; Even if we wake up the launcher even when a ROLLBACK PREPARED, it's not a big deal to the launcher process actually. Compared with the complexly of changing the logic so that on_commit_launcher_wakeup corresponds to prepared transaction, we might want to accept this behavior. > To fix this issue, we should call AtEOXact_ApplyLauncher() in > FinishPreparedTransaction() or somewhere? > Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
At Wed, 19 Apr 2017 10:59:00 +0200, Petr Jelinek wrote in <3ef9c831-0508-51a9-5ded-c2e31e958...@2ndquadrant.com> > On 19/04/17 10:45, Kyotaro HORIGUCHI wrote: > > At Wed, 19 Apr 2017 17:43:17 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI > > wrote in > > <20170419.174317.114509231.horiguchi.kyot...@lab.ntt.co.jp> > >> At Wed, 19 Apr 2017 10:33:29 +0200, Petr Jelinek > >> wrote in > >> > >> Some other process may modify it then go to there. With a kind of > >> priority inversion, the process may modify the data on the memory > >> *before* we modify it. Of course this is rather unrealistic, > >> though. > > > > A bit short. > > > > Some other process may modify it *after* we read it then go to > > there. With a kind of priority inversion, the process may modify > > the data on the memory *before* we modify it. Of course this is > > rather unrealistic, though. > > > > Yeah but I think that's risk anyway in MVCC read committed database, the > only way to protect against that would be to hold lock over the catalogs > access which was what we wanted to get rid of :) Agreed, or, I'm not sure about the real risks. > BTW the whole sync coordination is explicitly written in a way that > unless you mess with catalogs manually only the tablesync process should > do UPDATEs to that catalog (with the exception of s->r state switch > which can happen in apply and has no effect on sync because both states > mean that synchronization is done, only makes apply stop tracking the > table individually). Hmm. Inhibiting retrospective updates of the on-memory data would save from some kind of priority inversions but such kind of manual operation is similar to overwriting of pg_control and I think is not worth bothering about:p regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On Fri, Apr 21, 2017 at 4:41 AM, Peter Eisentraut wrote: > On 4/20/17 12:30, Fujii Masao wrote: >> I've pushed several patches, and there is now only one remaining patch. >> I posted the review comment on that patch, and I'm expecting that >> Masahiko-san will update the patch. So what about waiting for the updated >> version of the patch by next Monday (April 24th)? > > Thank you for pitching in. > > Where is your latest patch? > The remaining patch is 002_Reset_on_commit_launcher_wakeup_v3.patch attached on this mail[1]. I'll update this patch and send it. [1] https://www.postgresql.org/message-id/CAD21AoB1HiZV%2B%2Bah%3DVrQjVZoH%2Bb6Rn8Atv6Miz%2BHCp0j%2BL6GZQ%40mail.gmail.com Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On 4/20/17 12:30, Fujii Masao wrote: > I've pushed several patches, and there is now only one remaining patch. > I posted the review comment on that patch, and I'm expecting that > Masahiko-san will update the patch. So what about waiting for the updated > version of the patch by next Monday (April 24th)? Thank you for pitching in. Where is your latest patch? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On Thu, Apr 20, 2017 at 12:05 PM, Noah Misch wrote: > On Sun, Apr 16, 2017 at 06:14:49AM +, Noah Misch wrote: >> On Fri, Apr 14, 2017 at 04:47:12AM +0900, Fujii Masao wrote: >> > Though I've read only a part of the logical rep code yet, I'd like to >> > share some (relatively minor) review comments that I got so far. >> > >> > In ApplyWorkerMain(), DatumGetInt32() should be used to get integer >> > value from the argument, instead of DatumGetObjectId(). >> > >> > No one resets on_commit_launcher_wakeup flag to false. >> > >> > ApplyLauncherWakeup() should be static function. >> > >> > Seems not only CREATE SUBSCRIPTION but also ALTER SUBSCRIPTION's >> > subcommands (e.g., ENABLED one) should wake the launcher up on commit. >> > >> > Why is IsBackendPid() check necessary in ApplyLauncherWakeup()? >> > >> > Normal statements that the walsender for logical rep runs are logged >> > by log_replication_commands. So if log_statement is also set, >> > those commands are logged twice. >> > >> > LogicalRepCtx->launcher_pid isn't reset to 0 when the launcher emits >> > an error. The callback function to reset it needs to be registered >> > via on_shmem_exit(). >> > >> > Typo: "an subscriber" should be "a subscriber" in some places. >> > >> > /* Get conninfo */ >> > datum = SysCacheGetAttr(SUBSCRIPTIONOID, >> > tup, >> > Anum_pg_subscription_subconninfo, >> > &isnull); >> > Assert(!isnull); >> > >> > This assertion makes me think that pg_subscription.subconnifo should >> > have NOT NULL constraint. Also subslotname and subpublications >> > should have NOT NULL constraint because of the same reason. >> > >> > SpinLockAcquire(&MyLogicalRepWorker->relmutex); >> > MyLogicalRepWorker->relstate = >> > GetSubscriptionRelState(MyLogicalRepWorker->subid, >> > MyLogicalRepWorker->relid, >> > &MyLogicalRepWorker->relstate_lsn, >> > false); >> > SpinLockRelease(&MyLogicalRepWorker->relmutex); >> > >> > Non-"short-term" function like GetSubscriptionRelState() should not >> > be called while spinlock is being held. >> >> [Action required within three days. This is a generic notification.] >> >> The above-described topic is currently a PostgreSQL 10 open item. Peter, >> since you committed the patch believed to have created it, you own this open >> item. If some other commit is more relevant or if this does not belong as a >> v10 open item, please let us know. Otherwise, please observe the policy on >> open item ownership[1] and send a status update within three calendar days of >> this message. Include a date for your subsequent status update. Testers may >> discover new open items at any time, and I want to plan to get them all fixed >> well in advance of shipping v10. Consequently, I will appreciate your >> efforts >> toward speedy resolution. Thanks. >> >> [1] >> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com > > This PostgreSQL 10 open item is past due for your status update. Kindly send > a status update within 24 hours, and include a date for your subsequent status > update. Refer to the policy on open item ownership: > https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com I'm not the owner of this item, but the current status is; I've pushed several patches, and there is now only one remaining patch. I posted the review comment on that patch, and I'm expecting that Masahiko-san will update the patch. So what about waiting for the updated version of the patch by next Monday (April 24th)? Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On Tue, Apr 18, 2017 at 5:16 PM, Masahiko Sawada wrote: > On Tue, Apr 18, 2017 at 12:24 PM, Kyotaro HORIGUCHI > wrote: >> Hi, >> >> Thank you for the revised version. >> >> At Mon, 17 Apr 2017 23:29:28 +0900, Masahiko Sawada >> wrote in >>> On Mon, Apr 17, 2017 at 9:13 PM, Masahiko Sawada >>> wrote: >>> > On Mon, Apr 17, 2017 at 7:39 PM, Kyotaro HORIGUCHI >>> > wrote: >>> >> At Mon, 17 Apr 2017 18:02:57 +0900, Masahiko Sawada >>> >> wrote in >>> >> >>> >>> On Fri, Apr 14, 2017 at 4:47 AM, Fujii Masao >>> >>> wrote: >>> >>> 1. >>> >>> > >>> >>> > In ApplyWorkerMain(), DatumGetInt32() should be used to get integer >>> >>> > value from the argument, instead of DatumGetObjectId(). >>> >>> >>> >>> Attached 001 patch fixes it. >>> >> >>> >> Hmm. I looked at the launcher side and found that it assigns bare >>> >> integer to bgw_main_arg. It also should use Int32GetDatum. >>> > >>> > Yeah, I agreed. Will fix it. >> >> Thanks. >> >>> >>> 2. >>> >>> > >>> >>> > No one resets on_commit_launcher_wakeup flag to false. >>> >>> >>> >>> Attached 002 patch makes on_commit_launcher_wakeup turn off after woke >>> >>> up the launcher process. >>> >> >>> >> It is omitting the abort case. Maybe it should be >>> >> AtEOXact_ApplyLauncher(bool commit)? >>> > >>> > Should we wake up the launcher process even when abort? >> >> No, I meant that on_commit_launcher_wakeup should be just reset >> without waking up launcher on abort. > > I understood. Sounds reasonable. ROLLBACK PREPARED should not wake up the launcher, but not even with the patch. To fix this issue, we should call AtEOXact_ApplyLauncher() in FinishPreparedTransaction() or somewhere? Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On Sun, Apr 16, 2017 at 06:14:49AM +, Noah Misch wrote: > On Fri, Apr 14, 2017 at 04:47:12AM +0900, Fujii Masao wrote: > > Though I've read only a part of the logical rep code yet, I'd like to > > share some (relatively minor) review comments that I got so far. > > > > In ApplyWorkerMain(), DatumGetInt32() should be used to get integer > > value from the argument, instead of DatumGetObjectId(). > > > > No one resets on_commit_launcher_wakeup flag to false. > > > > ApplyLauncherWakeup() should be static function. > > > > Seems not only CREATE SUBSCRIPTION but also ALTER SUBSCRIPTION's > > subcommands (e.g., ENABLED one) should wake the launcher up on commit. > > > > Why is IsBackendPid() check necessary in ApplyLauncherWakeup()? > > > > Normal statements that the walsender for logical rep runs are logged > > by log_replication_commands. So if log_statement is also set, > > those commands are logged twice. > > > > LogicalRepCtx->launcher_pid isn't reset to 0 when the launcher emits > > an error. The callback function to reset it needs to be registered > > via on_shmem_exit(). > > > > Typo: "an subscriber" should be "a subscriber" in some places. > > > > /* Get conninfo */ > > datum = SysCacheGetAttr(SUBSCRIPTIONOID, > > tup, > > Anum_pg_subscription_subconninfo, > > &isnull); > > Assert(!isnull); > > > > This assertion makes me think that pg_subscription.subconnifo should > > have NOT NULL constraint. Also subslotname and subpublications > > should have NOT NULL constraint because of the same reason. > > > > SpinLockAcquire(&MyLogicalRepWorker->relmutex); > > MyLogicalRepWorker->relstate = > > GetSubscriptionRelState(MyLogicalRepWorker->subid, > > MyLogicalRepWorker->relid, > > &MyLogicalRepWorker->relstate_lsn, > > false); > > SpinLockRelease(&MyLogicalRepWorker->relmutex); > > > > Non-"short-term" function like GetSubscriptionRelState() should not > > be called while spinlock is being held. > > [Action required within three days. This is a generic notification.] > > The above-described topic is currently a PostgreSQL 10 open item. Peter, > since you committed the patch believed to have created it, you own this open > item. If some other commit is more relevant or if this does not belong as a > v10 open item, please let us know. Otherwise, please observe the policy on > open item ownership[1] and send a status update within three calendar days of > this message. Include a date for your subsequent status update. Testers may > discover new open items at any time, and I want to plan to get them all fixed > well in advance of shipping v10. Consequently, I will appreciate your efforts > toward speedy resolution. Thanks. > > [1] > https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com This PostgreSQL 10 open item is past due for your status update. Kindly send a status update within 24 hours, and include a date for your subsequent status update. Refer to the policy on open item ownership: https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On 19/04/17 10:45, Kyotaro HORIGUCHI wrote: > At Wed, 19 Apr 2017 17:43:17 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI > wrote in > <20170419.174317.114509231.horiguchi.kyot...@lab.ntt.co.jp> >> At Wed, 19 Apr 2017 10:33:29 +0200, Petr Jelinek >> wrote in >> Commit has been moved from after to before of the lock section. This causes potential race condition. (As the same as the potential dead-lock, I'm not sure it can cause realistic problem, though..) Isn't it better to be after the lock section? >>> >>> We just read catalogs so there should not be locking issues. >> >> Some other process may modify it then go to there. With a kind of >> priority inversion, the process may modify the data on the memory >> *before* we modify it. Of course this is rather unrealistic, >> though. > > A bit short. > > Some other process may modify it *after* we read it then go to > there. With a kind of priority inversion, the process may modify > the data on the memory *before* we modify it. Of course this is > rather unrealistic, though. > Yeah but I think that's risk anyway in MVCC read committed database, the only way to protect against that would be to hold lock over the catalogs access which was what we wanted to get rid of :) BTW the whole sync coordination is explicitly written in a way that unless you mess with catalogs manually only the tablesync process should do UPDATEs to that catalog (with the exception of s->r state switch which can happen in apply and has no effect on sync because both states mean that synchronization is done, only makes apply stop tracking the table individually). -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
At Wed, 19 Apr 2017 17:43:17 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20170419.174317.114509231.horiguchi.kyot...@lab.ntt.co.jp> > At Wed, 19 Apr 2017 10:33:29 +0200, Petr Jelinek > wrote in > > > > Commit has been moved from after to before of the lock section. > > > This causes potential race condition. (As the same as the > > > potential dead-lock, I'm not sure it can cause realistic problem, > > > though..) Isn't it better to be after the lock section? > > > > > > > We just read catalogs so there should not be locking issues. > > Some other process may modify it then go to there. With a kind of > priority inversion, the process may modify the data on the memory > *before* we modify it. Of course this is rather unrealistic, > though. A bit short. Some other process may modify it *after* we read it then go to there. With a kind of priority inversion, the process may modify the data on the memory *before* we modify it. Of course this is rather unrealistic, though. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
At Wed, 19 Apr 2017 10:33:29 +0200, Petr Jelinek wrote in > > Commit has been moved from after to before of the lock section. > > This causes potential race condition. (As the same as the > > potential dead-lock, I'm not sure it can cause realistic problem, > > though..) Isn't it better to be after the lock section? > > > > We just read catalogs so there should not be locking issues. Some other process may modify it then go to there. With a kind of priority inversion, the process may modify the data on the memory *before* we modify it. Of course this is rather unrealistic, though. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On 19/04/17 10:25, Kyotaro HORIGUCHI wrote: > At Wed, 19 Apr 2017 04:18:18 +0200, Petr Jelinek > wrote in > >> On 18/04/17 19:27, Fujii Masao wrote: >>> On Wed, Apr 19, 2017 at 1:35 AM, Petr Jelinek >>> wrote: Thank you for working on this! On 18/04/17 10:16, Masahiko Sawada wrote: > On Tue, Apr 18, 2017 at 12:24 PM, Kyotaro HORIGUCHI > wrote: >> 10. >>> >>> SpinLockAcquire(&MyLogicalRepWorker->relmutex); >>> MyLogicalRepWorker->relstate = >>> GetSubscriptionRelState(MyLogicalRepWorker->subid, >>> MyLogicalRepWorker->relid, >>> &MyLogicalRepWorker->relstate_lsn, >>> false); >>> SpinLockRelease(&MyLogicalRepWorker->relmutex); >>> >>> Non-"short-term" function like GetSubscriptionRelState() should not >>> be called while spinlock is being held. >>> >> >> One option is adding new LWLock for the relation state change but >> this >> lock is used at many locations where the operation actually doesn't >> take time. I think that the discussion would be needed to get >> consensus, so patch for it is not attached. > > From the point of view of time, I agree that it doesn't seem to > harm. Bt I thing it as more significant problem that > potentially-throwable function is called within the mutex > region. It potentially causes a kind of dead lock. (That said, > theoretically dosn't occur and I'm not sure what happens by the > dead lock..) > Hmm, I think doing what I attached should be fine here. >>> >>> Thanks for the patch! >>> We don't need to hold spinlock for table read, just for changing the MyLogicalRepWorker->relstate. >>> >>> Yes, but the update of MyLogicalRepWorker->relstate_lsn also should >>> be protected with the spinlock. >>> >> >> Yes, sorry tired/blind, fixed. > > Commit has been moved from after to before of the lock section. > This causes potential race condition. (As the same as the > potential dead-lock, I'm not sure it can cause realistic problem, > though..) Isn't it better to be after the lock section? > We just read catalogs so there should not be locking issues. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
At Wed, 19 Apr 2017 04:18:18 +0200, Petr Jelinek wrote in > On 18/04/17 19:27, Fujii Masao wrote: > > On Wed, Apr 19, 2017 at 1:35 AM, Petr Jelinek > > wrote: > >> Thank you for working on this! > >> > >> On 18/04/17 10:16, Masahiko Sawada wrote: > >>> On Tue, Apr 18, 2017 at 12:24 PM, Kyotaro HORIGUCHI > >>> wrote: > >> > 10. > > > > SpinLockAcquire(&MyLogicalRepWorker->relmutex); > > MyLogicalRepWorker->relstate = > > GetSubscriptionRelState(MyLogicalRepWorker->subid, > > MyLogicalRepWorker->relid, > > &MyLogicalRepWorker->relstate_lsn, > > false); > > SpinLockRelease(&MyLogicalRepWorker->relmutex); > > > > Non-"short-term" function like GetSubscriptionRelState() should not > > be called while spinlock is being held. > > > > One option is adding new LWLock for the relation state change but > this > lock is used at many locations where the operation actually doesn't > take time. I think that the discussion would be needed to get > consensus, so patch for it is not attached. > >>> > >>> From the point of view of time, I agree that it doesn't seem to > >>> harm. Bt I thing it as more significant problem that > >>> potentially-throwable function is called within the mutex > >>> region. It potentially causes a kind of dead lock. (That said, > >>> theoretically dosn't occur and I'm not sure what happens by the > >>> dead lock..) > >>> > >> > >> Hmm, I think doing what I attached should be fine here. > > > > Thanks for the patch! > > > >> We don't need to > >> hold spinlock for table read, just for changing the > >> MyLogicalRepWorker->relstate. > > > > Yes, but the update of MyLogicalRepWorker->relstate_lsn also should > > be protected with the spinlock. > > > > Yes, sorry tired/blind, fixed. Commit has been moved from after to before of the lock section. This causes potential race condition. (As the same as the potential dead-lock, I'm not sure it can cause realistic problem, though..) Isn't it better to be after the lock section? regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On 18/04/17 19:27, Fujii Masao wrote: > On Wed, Apr 19, 2017 at 1:35 AM, Petr Jelinek > wrote: >> Thank you for working on this! >> >> On 18/04/17 10:16, Masahiko Sawada wrote: >>> On Tue, Apr 18, 2017 at 12:24 PM, Kyotaro HORIGUCHI >>> wrote: >> 10. > > SpinLockAcquire(&MyLogicalRepWorker->relmutex); > MyLogicalRepWorker->relstate = > GetSubscriptionRelState(MyLogicalRepWorker->subid, > MyLogicalRepWorker->relid, > &MyLogicalRepWorker->relstate_lsn, > false); > SpinLockRelease(&MyLogicalRepWorker->relmutex); > > Non-"short-term" function like GetSubscriptionRelState() should not > be called while spinlock is being held. > One option is adding new LWLock for the relation state change but this lock is used at many locations where the operation actually doesn't take time. I think that the discussion would be needed to get consensus, so patch for it is not attached. >>> >>> From the point of view of time, I agree that it doesn't seem to >>> harm. Bt I thing it as more significant problem that >>> potentially-throwable function is called within the mutex >>> region. It potentially causes a kind of dead lock. (That said, >>> theoretically dosn't occur and I'm not sure what happens by the >>> dead lock..) >>> >> >> Hmm, I think doing what I attached should be fine here. > > Thanks for the patch! > >> We don't need to >> hold spinlock for table read, just for changing the >> MyLogicalRepWorker->relstate. > > Yes, but the update of MyLogicalRepWorker->relstate_lsn also should > be protected with the spinlock. > Yes, sorry tired/blind, fixed. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c index 108326b..7262709 100644 --- a/src/backend/replication/logical/tablesync.c +++ b/src/backend/replication/logical/tablesync.c @@ -703,19 +703,22 @@ copy_table(Relation rel) char * LogicalRepSyncTableStart(XLogRecPtr *origin_startpos) { - char *slotname; - char *err; + char *slotname; + char *err; + charrelstate; + XLogRecPtr relstate_lsn; /* Check the state of the table synchronization. */ StartTransactionCommand(); + relstate = GetSubscriptionRelState(MyLogicalRepWorker->subid, + MyLogicalRepWorker->relid, + &relstate_lsn, false); + CommitTransactionCommand(); + SpinLockAcquire(&MyLogicalRepWorker->relmutex); - MyLogicalRepWorker->relstate = - GetSubscriptionRelState(MyLogicalRepWorker->subid, - MyLogicalRepWorker->relid, - &MyLogicalRepWorker->relstate_lsn, - false); + MyLogicalRepWorker->relstate = relstate; + MyLogicalRepWorker->relstate_lsn = relstate_lsn; SpinLockRelease(&MyLogicalRepWorker->relmutex); - CommitTransactionCommand(); /* * To build a slot name for the sync work, we are limited to NAMEDATALEN - -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On Wed, Apr 19, 2017 at 1:35 AM, Petr Jelinek wrote: > Thank you for working on this! > > On 18/04/17 10:16, Masahiko Sawada wrote: >> On Tue, Apr 18, 2017 at 12:24 PM, Kyotaro HORIGUCHI >> wrote: >>> >>> 3. ApplyLauncherWakeup() should be static function. >>> >>> Attached 003 patch fixes it (and also fixes #5 review comment). >>> >>> 4. Seems not only CREATE SUBSCRIPTION but also ALTER SUBSCRIPTION's subcommands (e.g., ENABLED one) should wake the launcher up on commit. >>> >>> This is also reported by Horiguchi-san on another thread[1], and patch >>> exists. >>> >>> 5. Why is IsBackendPid() check necessary in ApplyLauncherWakeup()? >>> >>> I also guess it's necessary. This change is included in #3 patch. >> >> IsBackendPid() is not free, I suppose that just ignoring failure >> with ESRCH is enough. > > After thought, since LogicalRepCtx->launcher_pid could be 0 should, we > check if launcher_pid != 0? >>> >>> Yes. For example, code to send a signal to autovacuum launcher >>> looks as the follows. I don't think there's a reason to do >>> different thing here. >>> >>> | avlauncher_needs_signal = false; >>> | if (AutoVacPID != 0) >>> | kill(AutoVacPID, SIGUSR2); >>> >> >> Fixed. > > Yes the IsBackendPid was needed only because we didn't reset > launcher_pid and it was causing issues otherwise obviously (and I didn't > realize we are not resetting it...) > >> >>> >>> 6. Normal statements that the walsender for logical rep runs are logged by log_replication_commands. So if log_statement is also set, those commands are logged twice. >>> >>> Attached 006 patch fixes it by adding "-c log_statement = none" to >>> connection parameter of libpqrcv if logical = true. >> >> The control by log_replication_commands is performed on >> walsender, so this also shuld be done on the same place. Anyway >> log_statement is irrelevant to walsender. > > Patch 006 emits all logs executed by the apply worker as a log of > log_replication_command but perhaps you're right. It would be better > to leave the log of log_statement if the command executed by the apply > worker is a SQL command. Will fix. >>> >>> Here, we can choose whether a SQL command is a part of >>> replication commands or not. The previous fix thought it is and >>> this doesn't. (They are emitted in different forms) Is this ok? >> >> Yes, v2 patch logs other than T_SQLCmd as a replication command, and >> T_SQLCmd is logged later in exec_simple_query. The >> log_replication_command logs only replication commands[1], it doesn't >> mean to log commands executed on replication connection. I think such >> behavior is right. >> >>> I'm not sure why you have moved the location of the code, but if >>> it should show all received commands, it might be better to be >>> placed before CHECK_FOR_INTERRUPTS.. >> >> Hmm, the reason why I've moved it is that we cannot know whether given >> cmd_string is a SQL command or a replication command before parsing. >> > > Well the issue is that then the query is not logged if there was parsing > issue even when logging was specifically requested. I don't know what's > good solution here, maybe teaching exec_simple_query to not log the > query if it came from walsender. > > BTW reading that function in walsender, there is : >> /* >>* CREATE_REPLICATION_SLOT ... LOGICAL exports a snapshot until the >> next >>* command arrives. Clean up the old stuff if there's anything. >>*/ >> SnapBuildClearExportedSnapshot(); > > and then > >> /* >>* CREATE_REPLICATION_SLOT ... LOGICAL exports a snapshot. If it was >>* called outside of transaction the snapshot should be cleared here. >>*/ >> if (!IsTransactionBlock()) >> SnapBuildClearExportedSnapshot(); > > The first one should not be there, it looks like a result of some merge > conflict being solved wrongly. Maybe your patch could fix that too? > > >>> 10. SpinLockAcquire(&MyLogicalRepWorker->relmutex); MyLogicalRepWorker->relstate = GetSubscriptionRelState(MyLogicalRepWorker->subid, MyLogicalRepWorker->relid, &MyLogicalRepWorker->relstate_lsn, false); SpinLockRelease(&MyLogicalRepWorker->relmutex); Non-"short-term" function like GetSubscriptionRelState() should not be called while spinlock is being held. >>> >>> One option is adding new LWLock for the relation state change but this >>> lock is used at many locations where the operation actually doesn't >>> take time. I think that the discussion would be needed to get >>> consensus, so patch for it is not attached. >> >> From the
Re: [HACKERS] some review comments on logical rep code
Thank you for working on this! On 18/04/17 10:16, Masahiko Sawada wrote: > On Tue, Apr 18, 2017 at 12:24 PM, Kyotaro HORIGUCHI > wrote: >> >> 3. >>> >>> ApplyLauncherWakeup() should be static function. >> >> Attached 003 patch fixes it (and also fixes #5 review comment). >> >> 4. >>> >>> Seems not only CREATE SUBSCRIPTION but also ALTER SUBSCRIPTION's >>> subcommands (e.g., ENABLED one) should wake the launcher up on commit. >> >> This is also reported by Horiguchi-san on another thread[1], and patch >> exists. >> >> 5. >>> >>> Why is IsBackendPid() check necessary in ApplyLauncherWakeup()? >> >> I also guess it's necessary. This change is included in #3 patch. > > IsBackendPid() is not free, I suppose that just ignoring failure > with ESRCH is enough. After thought, since LogicalRepCtx->launcher_pid could be 0 should, we check if launcher_pid != 0? >> >> Yes. For example, code to send a signal to autovacuum launcher >> looks as the follows. I don't think there's a reason to do >> different thing here. >> >> | avlauncher_needs_signal = false; >> | if (AutoVacPID != 0) >> | kill(AutoVacPID, SIGUSR2); >> > > Fixed. Yes the IsBackendPid was needed only because we didn't reset launcher_pid and it was causing issues otherwise obviously (and I didn't realize we are not resetting it...) > >> >> 6. >>> >>> Normal statements that the walsender for logical rep runs are logged >>> by log_replication_commands. So if log_statement is also set, >>> those commands are logged twice. >> >> Attached 006 patch fixes it by adding "-c log_statement = none" to >> connection parameter of libpqrcv if logical = true. > > The control by log_replication_commands is performed on > walsender, so this also shuld be done on the same place. Anyway > log_statement is irrelevant to walsender. Patch 006 emits all logs executed by the apply worker as a log of log_replication_command but perhaps you're right. It would be better to leave the log of log_statement if the command executed by the apply worker is a SQL command. Will fix. >> >> Here, we can choose whether a SQL command is a part of >> replication commands or not. The previous fix thought it is and >> this doesn't. (They are emitted in different forms) Is this ok? > > Yes, v2 patch logs other than T_SQLCmd as a replication command, and > T_SQLCmd is logged later in exec_simple_query. The > log_replication_command logs only replication commands[1], it doesn't > mean to log commands executed on replication connection. I think such > behavior is right. > >> I'm not sure why you have moved the location of the code, but if >> it should show all received commands, it might be better to be >> placed before CHECK_FOR_INTERRUPTS.. > > Hmm, the reason why I've moved it is that we cannot know whether given > cmd_string is a SQL command or a replication command before parsing. > Well the issue is that then the query is not logged if there was parsing issue even when logging was specifically requested. I don't know what's good solution here, maybe teaching exec_simple_query to not log the query if it came from walsender. BTW reading that function in walsender, there is : > /* >* CREATE_REPLICATION_SLOT ... LOGICAL exports a snapshot until the next >* command arrives. Clean up the old stuff if there's anything. >*/ > SnapBuildClearExportedSnapshot(); and then > /* >* CREATE_REPLICATION_SLOT ... LOGICAL exports a snapshot. If it was >* called outside of transaction the snapshot should be cleared here. >*/ > if (!IsTransactionBlock()) > SnapBuildClearExportedSnapshot(); The first one should not be there, it looks like a result of some merge conflict being solved wrongly. Maybe your patch could fix that too? >> 10. >>> >>> SpinLockAcquire(&MyLogicalRepWorker->relmutex); >>> MyLogicalRepWorker->relstate = >>> GetSubscriptionRelState(MyLogicalRepWorker->subid, >>> MyLogicalRepWorker->relid, >>> &MyLogicalRepWorker->relstate_lsn, >>> false); >>> SpinLockRelease(&MyLogicalRepWorker->relmutex); >>> >>> Non-"short-term" function like GetSubscriptionRelState() should not >>> be called while spinlock is being held. >>> >> >> One option is adding new LWLock for the relation state change but this >> lock is used at many locations where the operation actually doesn't >> take time. I think that the discussion would be needed to get >> consensus, so patch for it is not attached. > > From the point of view of time, I agree that it doesn't seem to > harm. Bt I thing it as more significant problem that > potentially-throwable function is called within the mutex > region. It
Re: [HACKERS] some review comments on logical rep code
On Tue, Apr 18, 2017 at 12:24 PM, Kyotaro HORIGUCHI wrote: > Hi, > > Thank you for the revised version. > > At Mon, 17 Apr 2017 23:29:28 +0900, Masahiko Sawada > wrote in >> On Mon, Apr 17, 2017 at 9:13 PM, Masahiko Sawada >> wrote: >> > On Mon, Apr 17, 2017 at 7:39 PM, Kyotaro HORIGUCHI >> > wrote: >> >> At Mon, 17 Apr 2017 18:02:57 +0900, Masahiko Sawada >> >> wrote in >> >> >> >>> On Fri, Apr 14, 2017 at 4:47 AM, Fujii Masao >> >>> wrote: >> >>> 1. >> >>> > >> >>> > In ApplyWorkerMain(), DatumGetInt32() should be used to get integer >> >>> > value from the argument, instead of DatumGetObjectId(). >> >>> >> >>> Attached 001 patch fixes it. >> >> >> >> Hmm. I looked at the launcher side and found that it assigns bare >> >> integer to bgw_main_arg. It also should use Int32GetDatum. >> > >> > Yeah, I agreed. Will fix it. > > Thanks. > >> >>> 2. >> >>> > >> >>> > No one resets on_commit_launcher_wakeup flag to false. >> >>> >> >>> Attached 002 patch makes on_commit_launcher_wakeup turn off after woke >> >>> up the launcher process. >> >> >> >> It is omitting the abort case. Maybe it should be >> >> AtEOXact_ApplyLauncher(bool commit)? >> > >> > Should we wake up the launcher process even when abort? > > No, I meant that on_commit_launcher_wakeup should be just reset > without waking up launcher on abort. I understood. Sounds reasonable. > >> >>> 3. >> >>> > >> >>> > ApplyLauncherWakeup() should be static function. >> >>> >> >>> Attached 003 patch fixes it (and also fixes #5 review comment). >> >>> >> >>> 4. >> >>> > >> >>> > Seems not only CREATE SUBSCRIPTION but also ALTER SUBSCRIPTION's >> >>> > subcommands (e.g., ENABLED one) should wake the launcher up on commit. >> >>> >> >>> This is also reported by Horiguchi-san on another thread[1], and patch >> >>> exists. >> >>> >> >>> 5. >> >>> > >> >>> > Why is IsBackendPid() check necessary in ApplyLauncherWakeup()? >> >>> >> >>> I also guess it's necessary. This change is included in #3 patch. >> >> >> >> IsBackendPid() is not free, I suppose that just ignoring failure >> >> with ESRCH is enough. >> > >> > After thought, since LogicalRepCtx->launcher_pid could be 0 should, we >> > check if launcher_pid != 0? > > Yes. For example, code to send a signal to autovacuum launcher > looks as the follows. I don't think there's a reason to do > different thing here. > > | avlauncher_needs_signal = false; > | if (AutoVacPID != 0) > | kill(AutoVacPID, SIGUSR2); > Fixed. > >> >>> 6. >> >>> > >> >>> > Normal statements that the walsender for logical rep runs are logged >> >>> > by log_replication_commands. So if log_statement is also set, >> >>> > those commands are logged twice. >> >>> >> >>> Attached 006 patch fixes it by adding "-c log_statement = none" to >> >>> connection parameter of libpqrcv if logical = true. >> >> >> >> The control by log_replication_commands is performed on >> >> walsender, so this also shuld be done on the same place. Anyway >> >> log_statement is irrelevant to walsender. >> > >> > Patch 006 emits all logs executed by the apply worker as a log of >> > log_replication_command but perhaps you're right. It would be better >> > to leave the log of log_statement if the command executed by the apply >> > worker is a SQL command. Will fix. > > Here, we can choose whether a SQL command is a part of > replication commands or not. The previous fix thought it is and > this doesn't. (They are emitted in different forms) Is this ok? Yes, v2 patch logs other than T_SQLCmd as a replication command, and T_SQLCmd is logged later in exec_simple_query. The log_replication_command logs only replication commands[1], it doesn't mean to log commands executed on replication connection. I think such behavior is right. > I'm not sure why you have moved the location of the code, but if > it should show all received commands, it might be better to be > placed before CHECK_FOR_INTERRUPTS.. Hmm, the reason why I've moved it is that we cannot know whether given cmd_string is a SQL command or a replication command before parsing. > > The comment for the code seems should be more clearly. > > - * compatibility. To prevent the command is logged doubly, we doesn't > - * log it when the command is a SQL command. > + * compatibility. SQL command are logged later according > + * to log_statement setting. Fixed. >> >>> 7. >> >>> > >> >>> > LogicalRepCtx->launcher_pid isn't reset to 0 when the launcher emits >> >>> > an error. The callback function to reset it needs to be registered >> >>> > via on_shmem_exit(). >> >>> >> >>> Attached 007 patch adds callback function to reset >> >>> LogicalRepCtx->launcher_pid. >> >>> >> >>> 8. >> >>> > >> >>> > Typo: "an subscriber" should be "a subscriber" in some places. >> >>> >> >>> It seems that there is no longer these typo. >> >>> >> >>> 9. >> >>> > /* Get conninfo */ >> >>> > datum = SysCacheGetAttr(SUBSCRIPTIONOID, >> >>> > tup, >> >>> > Anum_pg_subscription_subconninfo, >>
Re: [HACKERS] some review comments on logical rep code
Hi, Thank you for the revised version. At Mon, 17 Apr 2017 23:29:28 +0900, Masahiko Sawada wrote in > On Mon, Apr 17, 2017 at 9:13 PM, Masahiko Sawada > wrote: > > On Mon, Apr 17, 2017 at 7:39 PM, Kyotaro HORIGUCHI > > wrote: > >> At Mon, 17 Apr 2017 18:02:57 +0900, Masahiko Sawada > >> wrote in > >> > >>> On Fri, Apr 14, 2017 at 4:47 AM, Fujii Masao > >>> wrote: > >>> 1. > >>> > > >>> > In ApplyWorkerMain(), DatumGetInt32() should be used to get integer > >>> > value from the argument, instead of DatumGetObjectId(). > >>> > >>> Attached 001 patch fixes it. > >> > >> Hmm. I looked at the launcher side and found that it assigns bare > >> integer to bgw_main_arg. It also should use Int32GetDatum. > > > > Yeah, I agreed. Will fix it. Thanks. > >>> 2. > >>> > > >>> > No one resets on_commit_launcher_wakeup flag to false. > >>> > >>> Attached 002 patch makes on_commit_launcher_wakeup turn off after woke > >>> up the launcher process. > >> > >> It is omitting the abort case. Maybe it should be > >> AtEOXact_ApplyLauncher(bool commit)? > > > > Should we wake up the launcher process even when abort? No, I meant that on_commit_launcher_wakeup should be just reset without waking up launcher on abort. > >>> 3. > >>> > > >>> > ApplyLauncherWakeup() should be static function. > >>> > >>> Attached 003 patch fixes it (and also fixes #5 review comment). > >>> > >>> 4. > >>> > > >>> > Seems not only CREATE SUBSCRIPTION but also ALTER SUBSCRIPTION's > >>> > subcommands (e.g., ENABLED one) should wake the launcher up on commit. > >>> > >>> This is also reported by Horiguchi-san on another thread[1], and patch > >>> exists. > >>> > >>> 5. > >>> > > >>> > Why is IsBackendPid() check necessary in ApplyLauncherWakeup()? > >>> > >>> I also guess it's necessary. This change is included in #3 patch. > >> > >> IsBackendPid() is not free, I suppose that just ignoring failure > >> with ESRCH is enough. > > > > After thought, since LogicalRepCtx->launcher_pid could be 0 should, we > > check if launcher_pid != 0? Yes. For example, code to send a signal to autovacuum launcher looks as the follows. I don't think there's a reason to do different thing here. | avlauncher_needs_signal = false; | if (AutoVacPID != 0) | kill(AutoVacPID, SIGUSR2); > >>> 6. > >>> > > >>> > Normal statements that the walsender for logical rep runs are logged > >>> > by log_replication_commands. So if log_statement is also set, > >>> > those commands are logged twice. > >>> > >>> Attached 006 patch fixes it by adding "-c log_statement = none" to > >>> connection parameter of libpqrcv if logical = true. > >> > >> The control by log_replication_commands is performed on > >> walsender, so this also shuld be done on the same place. Anyway > >> log_statement is irrelevant to walsender. > > > > Patch 006 emits all logs executed by the apply worker as a log of > > log_replication_command but perhaps you're right. It would be better > > to leave the log of log_statement if the command executed by the apply > > worker is a SQL command. Will fix. Here, we can choose whether a SQL command is a part of replication commands or not. The previous fix thought it is and this doesn't. (They are emitted in different forms) Is this ok? I'm not sure why you have moved the location of the code, but if it should show all received commands, it might be better to be placed before CHECK_FOR_INTERRUPTS.. The comment for the code seems should be more clearly. - * compatibility. To prevent the command is logged doubly, we doesn't - * log it when the command is a SQL command. + * compatibility. SQL command are logged later according + * to log_statement setting. > >>> 7. > >>> > > >>> > LogicalRepCtx->launcher_pid isn't reset to 0 when the launcher emits > >>> > an error. The callback function to reset it needs to be registered > >>> > via on_shmem_exit(). > >>> > >>> Attached 007 patch adds callback function to reset > >>> LogicalRepCtx->launcher_pid. > >>> > >>> 8. > >>> > > >>> > Typo: "an subscriber" should be "a subscriber" in some places. > >>> > >>> It seems that there is no longer these typo. > >>> > >>> 9. > >>> > /* Get conninfo */ > >>> > datum = SysCacheGetAttr(SUBSCRIPTIONOID, > >>> > tup, > >>> > Anum_pg_subscription_subconninfo, > >>> > &isnull); > >>> > Assert(!isnull); > >>> > > >>> > This assertion makes me think that pg_subscription.subconnifo should > >>> > have NOT NULL constraint. Also subslotname and subpublications > >>> > should have NOT NULL constraint because of the same reason. > >>> > >>> Agreed. Attached 009 patch add NOT NULL constraint to all other > >>> columns of pg_subscription. pg_subscription.subsynccommit should have > >>> it as well. > >> > >> I'm not sure the policy here, but I suppose that the assertions > >> there are still required irrelevantly from the nun-nullness of > >> the attribute. > > > > You're right. Will fix it. > > > >>> 10. > >>> > > >>> > SpinLockAc
Re: [HACKERS] some review comments on logical rep code
On Mon, Apr 17, 2017 at 9:13 PM, Masahiko Sawada wrote: > On Mon, Apr 17, 2017 at 7:39 PM, Kyotaro HORIGUCHI > wrote: >> At Mon, 17 Apr 2017 18:02:57 +0900, Masahiko Sawada >> wrote in >>> On Fri, Apr 14, 2017 at 4:47 AM, Fujii Masao wrote: >>> > Hi, >>> > >>> > Though I've read only a part of the logical rep code yet, I'd like to >>> > share some (relatively minor) review comments that I got so far. >>> >>> It seems nobody is working on dealing with these review comments, so >>> I've attached patches. Since there are lots of review comment I >>> numbered each review comment. The prefix of patches I attached is >>> corresponding to review comment number. >>> > > Thank you for reviewing. > >>> 1. >>> > >>> > In ApplyWorkerMain(), DatumGetInt32() should be used to get integer >>> > value from the argument, instead of DatumGetObjectId(). >>> >>> Attached 001 patch fixes it. >> >> Hmm. I looked at the launcher side and found that it assigns bare >> integer to bgw_main_arg. It also should use Int32GetDatum. > > Yeah, I agreed. Will fix it. > >> >>> logicalrep_worker_launch(Oid dbid, Oid subid, const char *subname, Oid >>> userid, >>> Oid relid) >>> { >>> int slot; >> ... >>> for (slot = 0; slot < max_logical_replication_workers; slot++) >> ... >>> bgw.bgw_main_arg = slot; >> >> >> >>> 2. >>> > >>> > No one resets on_commit_launcher_wakeup flag to false. >>> >>> Attached 002 patch makes on_commit_launcher_wakeup turn off after woke >>> up the launcher process. >> >> It is omitting the abort case. Maybe it should be >> AtEOXact_ApplyLauncher(bool commit)? > > Should we wake up the launcher process even when abort? > >> >>> 3. >>> > >>> > ApplyLauncherWakeup() should be static function. >>> >>> Attached 003 patch fixes it (and also fixes #5 review comment). >>> >>> 4. >>> > >>> > Seems not only CREATE SUBSCRIPTION but also ALTER SUBSCRIPTION's >>> > subcommands (e.g., ENABLED one) should wake the launcher up on commit. >>> >>> This is also reported by Horiguchi-san on another thread[1], and patch >>> exists. >>> >>> 5. >>> > >>> > Why is IsBackendPid() check necessary in ApplyLauncherWakeup()? >>> >>> I also guess it's necessary. This change is included in #3 patch. >> >> IsBackendPid() is not free, I suppose that just ignoring failure >> with ESRCH is enough. > > After thought, since LogicalRepCtx->launcher_pid could be 0 should, we > check if launcher_pid != 0? > >> >>> 6. >>> > >>> > Normal statements that the walsender for logical rep runs are logged >>> > by log_replication_commands. So if log_statement is also set, >>> > those commands are logged twice. >>> >>> Attached 006 patch fixes it by adding "-c log_statement = none" to >>> connection parameter of libpqrcv if logical = true. >> >> The control by log_replication_commands is performed on >> walsender, so this also shuld be done on the same place. Anyway >> log_statement is irrelevant to walsender. > > Patch 006 emits all logs executed by the apply worker as a log of > log_replication_command but perhaps you're right. It would be better > to leave the log of log_statement if the command executed by the apply > worker is a SQL command. Will fix. > >> >>> 7. >>> > >>> > LogicalRepCtx->launcher_pid isn't reset to 0 when the launcher emits >>> > an error. The callback function to reset it needs to be registered >>> > via on_shmem_exit(). >>> >>> Attached 007 patch adds callback function to reset >>> LogicalRepCtx->launcher_pid. >>> >>> 8. >>> > >>> > Typo: "an subscriber" should be "a subscriber" in some places. >>> >>> It seems that there is no longer these typo. >>> >>> 9. >>> > /* Get conninfo */ >>> > datum = SysCacheGetAttr(SUBSCRIPTIONOID, >>> > tup, >>> > Anum_pg_subscription_subconninfo, >>> > &isnull); >>> > Assert(!isnull); >>> > >>> > This assertion makes me think that pg_subscription.subconnifo should >>> > have NOT NULL constraint. Also subslotname and subpublications >>> > should have NOT NULL constraint because of the same reason. >>> >>> Agreed. Attached 009 patch add NOT NULL constraint to all other >>> columns of pg_subscription. pg_subscription.subsynccommit should have >>> it as well. >> >> I'm not sure the policy here, but I suppose that the assertions >> there are still required irrelevantly from the nun-nullness of >> the attribute. > > You're right. Will fix it. > >>> 10. >>> > >>> > SpinLockAcquire(&MyLogicalRepWorker->relmutex); >>> > MyLogicalRepWorker->relstate = >>> > GetSubscriptionRelState(MyLogicalRepWorker->subid, >>> > MyLogicalRepWorker->relid, >>> > &MyLogicalRepWorker->relstate_lsn, >>> > false); >>> > SpinLockRelease(&MyLogicalRepWorker->relmutex); >>> > >>> > Non-"short-term" function like GetSubscriptionRelState() should not >>> > be called while spinlock is being held. >>> > >>> >>> One option is adding new LWLock for the relation state change but this >>> lock is used at many locations
Re: [HACKERS] some review comments on logical rep code
On Mon, Apr 17, 2017 at 7:39 PM, Kyotaro HORIGUCHI wrote: > At Mon, 17 Apr 2017 18:02:57 +0900, Masahiko Sawada > wrote in >> On Fri, Apr 14, 2017 at 4:47 AM, Fujii Masao wrote: >> > Hi, >> > >> > Though I've read only a part of the logical rep code yet, I'd like to >> > share some (relatively minor) review comments that I got so far. >> >> It seems nobody is working on dealing with these review comments, so >> I've attached patches. Since there are lots of review comment I >> numbered each review comment. The prefix of patches I attached is >> corresponding to review comment number. >> Thank you for reviewing. >> 1. >> > >> > In ApplyWorkerMain(), DatumGetInt32() should be used to get integer >> > value from the argument, instead of DatumGetObjectId(). >> >> Attached 001 patch fixes it. > > Hmm. I looked at the launcher side and found that it assigns bare > integer to bgw_main_arg. It also should use Int32GetDatum. Yeah, I agreed. Will fix it. > >> logicalrep_worker_launch(Oid dbid, Oid subid, const char *subname, Oid >> userid, >> Oid relid) >> { >> int slot; > ... >> for (slot = 0; slot < max_logical_replication_workers; slot++) > ... >> bgw.bgw_main_arg = slot; > > > >> 2. >> > >> > No one resets on_commit_launcher_wakeup flag to false. >> >> Attached 002 patch makes on_commit_launcher_wakeup turn off after woke >> up the launcher process. > > It is omitting the abort case. Maybe it should be > AtEOXact_ApplyLauncher(bool commit)? Should we wake up the launcher process even when abort? > >> 3. >> > >> > ApplyLauncherWakeup() should be static function. >> >> Attached 003 patch fixes it (and also fixes #5 review comment). >> >> 4. >> > >> > Seems not only CREATE SUBSCRIPTION but also ALTER SUBSCRIPTION's >> > subcommands (e.g., ENABLED one) should wake the launcher up on commit. >> >> This is also reported by Horiguchi-san on another thread[1], and patch >> exists. >> >> 5. >> > >> > Why is IsBackendPid() check necessary in ApplyLauncherWakeup()? >> >> I also guess it's necessary. This change is included in #3 patch. > > IsBackendPid() is not free, I suppose that just ignoring failure > with ESRCH is enough. After thought, since LogicalRepCtx->launcher_pid could be 0 should, we check if launcher_pid != 0? > >> 6. >> > >> > Normal statements that the walsender for logical rep runs are logged >> > by log_replication_commands. So if log_statement is also set, >> > those commands are logged twice. >> >> Attached 006 patch fixes it by adding "-c log_statement = none" to >> connection parameter of libpqrcv if logical = true. > > The control by log_replication_commands is performed on > walsender, so this also shuld be done on the same place. Anyway > log_statement is irrelevant to walsender. Patch 006 emits all logs executed by the apply worker as a log of log_replication_command but perhaps you're right. It would be better to leave the log of log_statement if the command executed by the apply worker is a SQL command. Will fix. > >> 7. >> > >> > LogicalRepCtx->launcher_pid isn't reset to 0 when the launcher emits >> > an error. The callback function to reset it needs to be registered >> > via on_shmem_exit(). >> >> Attached 007 patch adds callback function to reset >> LogicalRepCtx->launcher_pid. >> >> 8. >> > >> > Typo: "an subscriber" should be "a subscriber" in some places. >> >> It seems that there is no longer these typo. >> >> 9. >> > /* Get conninfo */ >> > datum = SysCacheGetAttr(SUBSCRIPTIONOID, >> > tup, >> > Anum_pg_subscription_subconninfo, >> > &isnull); >> > Assert(!isnull); >> > >> > This assertion makes me think that pg_subscription.subconnifo should >> > have NOT NULL constraint. Also subslotname and subpublications >> > should have NOT NULL constraint because of the same reason. >> >> Agreed. Attached 009 patch add NOT NULL constraint to all other >> columns of pg_subscription. pg_subscription.subsynccommit should have >> it as well. > > I'm not sure the policy here, but I suppose that the assertions > there are still required irrelevantly from the nun-nullness of > the attribute. You're right. Will fix it. >> 10. >> > >> > SpinLockAcquire(&MyLogicalRepWorker->relmutex); >> > MyLogicalRepWorker->relstate = >> > GetSubscriptionRelState(MyLogicalRepWorker->subid, >> > MyLogicalRepWorker->relid, >> > &MyLogicalRepWorker->relstate_lsn, >> > false); >> > SpinLockRelease(&MyLogicalRepWorker->relmutex); >> > >> > Non-"short-term" function like GetSubscriptionRelState() should not >> > be called while spinlock is being held. >> > >> >> One option is adding new LWLock for the relation state change but this >> lock is used at many locations where the operation actually doesn't >> take time. I think that the discussion would be needed to get >> consensus, so patch for it is not attached. > > From the point of view of time, I agree that it doesn't seem to > harm
Re: [HACKERS] some review comments on logical rep code
At Mon, 17 Apr 2017 18:02:57 +0900, Masahiko Sawada wrote in > On Fri, Apr 14, 2017 at 4:47 AM, Fujii Masao wrote: > > Hi, > > > > Though I've read only a part of the logical rep code yet, I'd like to > > share some (relatively minor) review comments that I got so far. > > It seems nobody is working on dealing with these review comments, so > I've attached patches. Since there are lots of review comment I > numbered each review comment. The prefix of patches I attached is > corresponding to review comment number. > > 1. > > > > In ApplyWorkerMain(), DatumGetInt32() should be used to get integer > > value from the argument, instead of DatumGetObjectId(). > > Attached 001 patch fixes it. Hmm. I looked at the launcher side and found that it assigns bare integer to bgw_main_arg. It also should use Int32GetDatum. > logicalrep_worker_launch(Oid dbid, Oid subid, const char *subname, Oid userid, > Oid relid) > { > int slot; ... > for (slot = 0; slot < max_logical_replication_workers; slot++) ... > bgw.bgw_main_arg = slot; > 2. > > > > No one resets on_commit_launcher_wakeup flag to false. > > Attached 002 patch makes on_commit_launcher_wakeup turn off after woke > up the launcher process. It is omitting the abort case. Maybe it should be AtEOXact_ApplyLauncher(bool commit)? > 3. > > > > ApplyLauncherWakeup() should be static function. > > Attached 003 patch fixes it (and also fixes #5 review comment). > > 4. > > > > Seems not only CREATE SUBSCRIPTION but also ALTER SUBSCRIPTION's > > subcommands (e.g., ENABLED one) should wake the launcher up on commit. > > This is also reported by Horiguchi-san on another thread[1], and patch exists. > > 5. > > > > Why is IsBackendPid() check necessary in ApplyLauncherWakeup()? > > I also guess it's necessary. This change is included in #3 patch. IsBackendPid() is not free, I suppose that just ignoring failure with ESRCH is enough. > 6. > > > > Normal statements that the walsender for logical rep runs are logged > > by log_replication_commands. So if log_statement is also set, > > those commands are logged twice. > > Attached 006 patch fixes it by adding "-c log_statement = none" to > connection parameter of libpqrcv if logical = true. The control by log_replication_commands is performed on walsender, so this also shuld be done on the same place. Anyway log_statement is irrelevant to walsender. > 7. > > > > LogicalRepCtx->launcher_pid isn't reset to 0 when the launcher emits > > an error. The callback function to reset it needs to be registered > > via on_shmem_exit(). > > Attached 007 patch adds callback function to reset > LogicalRepCtx->launcher_pid. > > 8. > > > > Typo: "an subscriber" should be "a subscriber" in some places. > > It seems that there is no longer these typo. > > 9. > > /* Get conninfo */ > > datum = SysCacheGetAttr(SUBSCRIPTIONOID, > > tup, > > Anum_pg_subscription_subconninfo, > > &isnull); > > Assert(!isnull); > > > > This assertion makes me think that pg_subscription.subconnifo should > > have NOT NULL constraint. Also subslotname and subpublications > > should have NOT NULL constraint because of the same reason. > > Agreed. Attached 009 patch add NOT NULL constraint to all other > columns of pg_subscription. pg_subscription.subsynccommit should have > it as well. I'm not sure the policy here, but I suppose that the assertions there are still required irrelevantly from the nun-nullness of the attribute. > 10. > > > > SpinLockAcquire(&MyLogicalRepWorker->relmutex); > > MyLogicalRepWorker->relstate = > > GetSubscriptionRelState(MyLogicalRepWorker->subid, > > MyLogicalRepWorker->relid, > > &MyLogicalRepWorker->relstate_lsn, > > false); > > SpinLockRelease(&MyLogicalRepWorker->relmutex); > > > > Non-"short-term" function like GetSubscriptionRelState() should not > > be called while spinlock is being held. > > > > One option is adding new LWLock for the relation state change but this > lock is used at many locations where the operation actually doesn't > take time. I think that the discussion would be needed to get > consensus, so patch for it is not attached. >From the point of view of time, I agree that it doesn't seem to harm. Bt I thing it as more significant problem that potentially-throwable function is called within the mutex region. It potentially causes a kind of dead lock. (That said, theoretically dosn't occur and I'm not sure what happens by the dead lock..) > [1] > https://www.postgresql.org/message-id/20170406.212441.177025736.horiguchi.kyot...@lab.ntt.co.jp -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On Fri, Apr 14, 2017 at 4:47 AM, Fujii Masao wrote: > Hi, > > Though I've read only a part of the logical rep code yet, I'd like to > share some (relatively minor) review comments that I got so far. It seems nobody is working on dealing with these review comments, so I've attached patches. Since there are lots of review comment I numbered each review comment. The prefix of patches I attached is corresponding to review comment number. 1. > > In ApplyWorkerMain(), DatumGetInt32() should be used to get integer > value from the argument, instead of DatumGetObjectId(). Attached 001 patch fixes it. 2. > > No one resets on_commit_launcher_wakeup flag to false. Attached 002 patch makes on_commit_launcher_wakeup turn off after woke up the launcher process. 3. > > ApplyLauncherWakeup() should be static function. Attached 003 patch fixes it (and also fixes #5 review comment). 4. > > Seems not only CREATE SUBSCRIPTION but also ALTER SUBSCRIPTION's > subcommands (e.g., ENABLED one) should wake the launcher up on commit. This is also reported by Horiguchi-san on another thread[1], and patch exists. 5. > > Why is IsBackendPid() check necessary in ApplyLauncherWakeup()? I also guess it's necessary. This change is included in #3 patch. 6. > > Normal statements that the walsender for logical rep runs are logged > by log_replication_commands. So if log_statement is also set, > those commands are logged twice. Attached 006 patch fixes it by adding "-c log_statement = none" to connection parameter of libpqrcv if logical = true. 7. > > LogicalRepCtx->launcher_pid isn't reset to 0 when the launcher emits > an error. The callback function to reset it needs to be registered > via on_shmem_exit(). Attached 007 patch adds callback function to reset LogicalRepCtx->launcher_pid. 8. > > Typo: "an subscriber" should be "a subscriber" in some places. It seems that there is no longer these typo. 9. > /* Get conninfo */ > datum = SysCacheGetAttr(SUBSCRIPTIONOID, > tup, > Anum_pg_subscription_subconninfo, > &isnull); > Assert(!isnull); > > This assertion makes me think that pg_subscription.subconnifo should > have NOT NULL constraint. Also subslotname and subpublications > should have NOT NULL constraint because of the same reason. Agreed. Attached 009 patch add NOT NULL constraint to all other columns of pg_subscription. pg_subscription.subsynccommit should have it as well. 10. > > SpinLockAcquire(&MyLogicalRepWorker->relmutex); > MyLogicalRepWorker->relstate = > GetSubscriptionRelState(MyLogicalRepWorker->subid, > MyLogicalRepWorker->relid, > &MyLogicalRepWorker->relstate_lsn, > false); > SpinLockRelease(&MyLogicalRepWorker->relmutex); > > Non-"short-term" function like GetSubscriptionRelState() should not > be called while spinlock is being held. > One option is adding new LWLock for the relation state change but this lock is used at many locations where the operation actually doesn't take time. I think that the discussion would be needed to get consensus, so patch for it is not attached. [1] https://www.postgresql.org/message-id/20170406.212441.177025736.horiguchi.kyot...@lab.ntt.co.jp Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center 001_change_DatumGetObjectId_to_DatumGetInt32.patch Description: Binary data 002_Reset_on_commit_launcher_wakeup.patch Description: Binary data 003_Fix_ApplyLancherWakeUp_function.patch Description: Binary data 006_Prevent_to_emit_statement_log_double.patch Description: Binary data 007_Regiter_on_shmem_exit_for_launcher.patch Description: Binary data 009_Add_not_null_constraint.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] some review comments on logical rep code
On Fri, Apr 14, 2017 at 04:47:12AM +0900, Fujii Masao wrote: > Though I've read only a part of the logical rep code yet, I'd like to > share some (relatively minor) review comments that I got so far. > > In ApplyWorkerMain(), DatumGetInt32() should be used to get integer > value from the argument, instead of DatumGetObjectId(). > > No one resets on_commit_launcher_wakeup flag to false. > > ApplyLauncherWakeup() should be static function. > > Seems not only CREATE SUBSCRIPTION but also ALTER SUBSCRIPTION's > subcommands (e.g., ENABLED one) should wake the launcher up on commit. > > Why is IsBackendPid() check necessary in ApplyLauncherWakeup()? > > Normal statements that the walsender for logical rep runs are logged > by log_replication_commands. So if log_statement is also set, > those commands are logged twice. > > LogicalRepCtx->launcher_pid isn't reset to 0 when the launcher emits > an error. The callback function to reset it needs to be registered > via on_shmem_exit(). > > Typo: "an subscriber" should be "a subscriber" in some places. > > /* Get conninfo */ > datum = SysCacheGetAttr(SUBSCRIPTIONOID, > tup, > Anum_pg_subscription_subconninfo, > &isnull); > Assert(!isnull); > > This assertion makes me think that pg_subscription.subconnifo should > have NOT NULL constraint. Also subslotname and subpublications > should have NOT NULL constraint because of the same reason. > > SpinLockAcquire(&MyLogicalRepWorker->relmutex); > MyLogicalRepWorker->relstate = > GetSubscriptionRelState(MyLogicalRepWorker->subid, > MyLogicalRepWorker->relid, > &MyLogicalRepWorker->relstate_lsn, > false); > SpinLockRelease(&MyLogicalRepWorker->relmutex); > > Non-"short-term" function like GetSubscriptionRelState() should not > be called while spinlock is being held. [Action required within three days. This is a generic notification.] The above-described topic is currently a PostgreSQL 10 open item. Peter, since you committed the patch believed to have created it, you own this open item. If some other commit is more relevant or if this does not belong as a v10 open item, please let us know. Otherwise, please observe the policy on open item ownership[1] and send a status update within three calendar days of this message. Include a date for your subsequent status update. Testers may discover new open items at any time, and I want to plan to get them all fixed well in advance of shipping v10. Consequently, I will appreciate your efforts toward speedy resolution. Thanks. [1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] some review comments on logical rep code
Hi, Though I've read only a part of the logical rep code yet, I'd like to share some (relatively minor) review comments that I got so far. In ApplyWorkerMain(), DatumGetInt32() should be used to get integer value from the argument, instead of DatumGetObjectId(). No one resets on_commit_launcher_wakeup flag to false. ApplyLauncherWakeup() should be static function. Seems not only CREATE SUBSCRIPTION but also ALTER SUBSCRIPTION's subcommands (e.g., ENABLED one) should wake the launcher up on commit. Why is IsBackendPid() check necessary in ApplyLauncherWakeup()? Normal statements that the walsender for logical rep runs are logged by log_replication_commands. So if log_statement is also set, those commands are logged twice. LogicalRepCtx->launcher_pid isn't reset to 0 when the launcher emits an error. The callback function to reset it needs to be registered via on_shmem_exit(). Typo: "an subscriber" should be "a subscriber" in some places. /* Get conninfo */ datum = SysCacheGetAttr(SUBSCRIPTIONOID, tup, Anum_pg_subscription_subconninfo, &isnull); Assert(!isnull); This assertion makes me think that pg_subscription.subconnifo should have NOT NULL constraint. Also subslotname and subpublications should have NOT NULL constraint because of the same reason. SpinLockAcquire(&MyLogicalRepWorker->relmutex); MyLogicalRepWorker->relstate = GetSubscriptionRelState(MyLogicalRepWorker->subid, MyLogicalRepWorker->relid, &MyLogicalRepWorker->relstate_lsn, false); SpinLockRelease(&MyLogicalRepWorker->relmutex); Non-"short-term" function like GetSubscriptionRelState() should not be called while spinlock is being held. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers