On Thu, Sep 13, 2018 at 6:30 PM Kyotaro HORIGUCHI <horiguchi.kyot...@lab.ntt.co.jp> wrote: > > Hello. > > Thank you for the comments, Sawada-san, Peter. > > At Mon, 10 Sep 2018 19:52:24 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI > <horiguchi.kyot...@lab.ntt.co.jp> wrote in > <20180910.195224.22629595.horiguchi.kyot...@lab.ntt.co.jp> > > At Thu, 6 Sep 2018 22:32:21 +0200, Peter Eisentraut > > <peter.eisentr...@2ndquadrant.com> wrote in > > <29bbd79d-696b-509e-578a-0fc38a3b9...@2ndquadrant.com> > > Thanks for pointing that. That's a major cause of confusion. Does > > the following make sense? > > > > > Specify the maximum size of WAL files that <link > > > linkend="streaming-replication-slots">replication slots</link> > > > are allowed to retain in the <filename>pg_wal</filename> > > > directory at checkpoint time. If > > > <varname>max_slot_wal_keep_size</varname> is zero (the > > > default), replication slots retain unlimited size of WAL files. > > + If restart_lsn of a replication slot gets behind more than that > > + bytes from the current LSN, the standby using the slot may not > > + be able to reconnect due to removal of required WAL records. > ... > > > Also, I don't think 0 is a good value for the default behavior. 0 would > > > mean that a slot is not allowed to retain any more WAL than already > > > exists anyway. Maybe we don't want to support that directly, but it's a > > > valid configuration. So maybe use -1 for infinity. > > > > In realtion to the reply just sent to Sawada-san, remain of a > > slot can be at most 16MB in the 0 case with the default segment > > size. So you're right in this sense. Will fix in the coming > > version. Thanks. > > I did the following thinkgs in the new version. > > - Changed the disable (or infinite) and default value of > max_slot_wal_keep_size to -1 from 0. > (patch 1, 2. guc.c, xlog.c: GetOldestKeepSegment()) > > - Fixed documentation for max_slot_wal_keep_size tomention what > happnes when WAL exceeds the size, and additional rewrites. > (patch 4, catalogs.sgml, config.sgml) > > - Folded parameter list of GetOldestKeepSegment(). > (patch 1, 2. xlog.c) > > - Provided the plural form of errdetail of checkpoint-time > warning. (patch 1, xlog.c: KeepLogSeg()) > > - Some cosmetic change and small refactor. > (patch 1, 2, 3) >
Sorry for the late response. The patch still can be applied to the curent HEAD so I reviewed the latest patch. The value of 'remain' and 'wal_status' might not be correct. Although 'wal_stats' shows 'lost' but we can get changes from the slot. I've tested it with the following steps. =# alter system set max_slot_wal_keep_size to '64MB'; -- while wal_keep_segments is 0 =# select pg_reload_conf(); =# select slot_name, wal_status, remain, pg_size_pretty(remain) as remain_pretty from pg_replication_slots ; slot_name | wal_status | remain | remain_pretty -----------+------------+----------+--------------- 1 | streaming | 83885648 | 80 MB (1 row) ** consume 80MB WAL, and do CHECKPOINT ** =# select slot_name, wal_status, remain, pg_size_pretty(remain) as remain_pretty from pg_replication_slots ; slot_name | wal_status | remain | remain_pretty -----------+------------+--------+--------------- 1 | lost | 0 | 0 bytes (1 row) =# select count(*) from pg_logical_slot_get_changes('1', NULL, NULL); count ------- 15 (1 row) ----- I got the following result with setting of wal_keep_segments > max_slot_keep_size. The 'wal_status' shows 'streaming' although the 'remain' is 0. =# select slot_name, wal_status, remain from pg_replication_slots limit 1; slot_name | wal_status | remain -----------+------------+-------- 1 | streaming | 0 (1 row) + XLByteToSeg(targetLSN, restartSeg, wal_segment_size); + if (max_slot_wal_keep_size_mb >= 0 && currSeg <= restartSeg + limitSegs) + { You use limitSegs here but shouldn't we use keepSeg instead? Actually I've commented this point for v6 patch before[1], and this had been fixed in the v7 patch. However you're using limitSegs again from v8 patch again. I might be missing something though. Changed the status to 'Waiting on Author'. [1] https://www.postgresql.org/message-id/CAD21AoD0rChq7wQE%3D_o95quopcQGjcVG9omwdH07nT5cm81hzg%40mail.gmail.com [2] https://www.postgresql.org/message-id/20180904.195250.144186960.horiguchi.kyotaro%40lab.ntt.co.jp Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATIONNTT Open Source Software Center