On Tue, Nov 15, 2022 at 1:29 AM David Christensen <david.christen...@crunchydata.com> wrote: > > Enclosed is v8, which uses the replication slot method to retain WAL > as well as fsync'ing the output directory when everything is done.
Thanks. It mostly is in good shape. However, few more comments: 1. + if it does not exist. The images saved will be subject to the same + filtering and limiting criteria as display records, but in this + mode <application>pg_waldump</application> will not output any other + information. May I know what's the intention of the statement 'The images saved ....'? If it's not necessary and convey anything useful to the user, can we remove it? 2. +#include "storage/checksum.h" +#include "storage/checksum_impl.h" I think we don't need the above includes as we got rid of verifying page checksums. The patch compiles without them for me. 3. + char *save_fpw_path; Can we rename the above variable to save_fpi_path, just to be in sync with what we expose to the user, the option name 'save-fpi'? 4. + if (config.save_fpw_path != NULL) + { + /* Fsync our output directory */ + fsync_fname(config.save_fpw_path, true); + } I guess adding a comment there as to why we aren't fsyncing for every file that gets created, but once per the directory at the end. That'd help clarify doubts that other members might get while looking at the code. 5. + if (config.save_fpw_path != NULL) + { + /* Fsync our output directory */ + fsync_fname(config.save_fpw_path, true); + } So, are we sure that we don't want to fsync for time_to_stop exit(0) cases, say when CTRL+C'ed. Looks like we handle time_to_stop safely meaning exiting with return code 0, shouldn't we fsync the directory? 6. + else if (config.save_fpw_path) Let's use the same convention to check non-NULLness, config.save_fpw_path != NULL. 7. +CHECKPOINT; +SELECT pg_switch_wal(); +UPDATE test_table SET a = a + 1; +SELECT pg_switch_wal(); I don't think switching WAL after checkpoint is necessary here, because the checkpoint ensures all the WAL gets flushed to disk. Please remove it. PS: I've seen the following code: +my $walfile = [sort { $a <=> $b } glob("$waldir/00*")]->[1]; # we want the second WAL file, which will be a complete WAL file with full-page writes for our specific relation. 8. +$node->safe_psql('postgres', <<EOF); +EOF Why EOF is used here? Can't we do something like below to execute multiple statements? $node->safe_psql( 'postgres', qq[ SELECT data FROM pg_logical_slot_get_changes('regression_slot1', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1'); SELECT data FROM pg_logical_slot_get_changes('regression_slot2', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1'); SELECT data FROM pg_logical_slot_get_changes('regression_slot3', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1'); SELECT data FROM pg_logical_slot_get_changes('regression_slot4', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1'); ]); Same here: +$node->safe_psql('postgres', <<EOQ); +SELECT pg_drop_replication_slot('regress_pg_waldump_slot'); +EOQ 9. +my $walfile = [sort { $a <=> $b } glob("$waldir/00*")]->[1]; # we want the second WAL file, which will be a complete WAL file with full-page writes for our specific relation. Is it guaranteed that just looking at the second WAL file in the pg_wal directory assures WAL file with FPIs? I think we have to save the WAL file that contains FPIs, that is the file after, CHECKPOINT, UPDATE and pg_switch_wal. I think you can store output LSN of pg_switch_wal 10. +$node->safe_psql('postgres', <<EOQ); +SELECT pg_drop_replication_slot('regress_pg_waldump_slot'); +EOQ +done_testing(); Do we need to explicitly drop the slot here? I think we don't specifically drop the replication slot in all the places, my guess is after done_testing(), the node would get destroyed and also the slot. I think it's not required. 11. +# verify filename formats matches w/--save-fpi +for my $fullpath (glob "$tmp_folder/raw/*") Do we need to look for the exact match of the file that gets created in the save-fpi path? While checking for this is great, it makes the test code non-portable (may not work on Windows or other platforms, no?) and complex? This way, you can get rid of get_block_info() as well? And +for my $fullpath (glob "$tmp_folder/raw/*") will also get simplified. I think you can further simplify the tests by: create the node generate an FPI call pg_waldump with save-fpi option check the target directory for a file that contains the relid, something like '%relid%'. The above would still serve the purpose, tests the code without much complexity. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com