Hi, On Fri, Oct 17, 2025 at 4:22 AM John H <[email protected]> wrote:
> Hi, > > On Wed, Oct 15, 2025 at 7:27 AM Srinath Reddy Sadipiralla > <[email protected]> wrote: > > > > > TBH first it seemed to me a good coding practice for future proofing, > > then i checked if there's any place in postgres we are doing > > something similar ,then found 1 in src/include/access/gist.h > > > > typedef struct GISTDeletedPageContents > > { > > /* last xid which could see the page in a scan */ > > FullTransactionId deleteXid; > > } GISTDeletedPageContents; > > > > It makes more sense to me if we expect the struct or same set of arguments > to be reused in different places / want a stable API reference. I don't > think > we want everything to be wrapped around a struct for functions just in > case. > > > .. > > thanks for updating, i have attached a diff patch on > > top of v9-0001 patch , where i tried to add more tests > > ... > > Thank you for the diff! Your changes are a lot cleaner and I've included > it in > the latest patch. One difference I've made is instead of additionally > logging in > decide_wal_file_action > > > + pg_log_debug("WAL segment \"%s\" is copied to target", fname); > > I realized we are already logging the WAL segments that are copied over in > print_filemap so I've updated the test to check for that instead > > > qr/pg_wal\/$corrupt_wal_seg \(COPY\)/ > > I also updated the error message for the last three checks. > Thanks for updating the patch, I have reviewed it, except these below concerns, patch LGTM. 1) regarding the parsing the WAL filename On Fri, Oct 17, 2025 at 4:25 AM John H <[email protected]> wrote: > On Thu, Oct 16, 2025 at 12:00 PM Robert Haas <[email protected]> > wrote: > > > > On Wed, Oct 15, 2025 at 10:27 AM Srinath Reddy Sadipiralla > > > ,the main problem is when if someone manually places an invalid WAL > file > > > in pg_wal like 00000001FFFFFFFFFFFFFF10, IsXLogFileName will > > > consider it as valid ,so with the approach as i mentioned earlier we > can > > > catch such cases. > > > > I think that parsing the file name may be a good idea so that we can > > do appropriate sanity checks on the values (e.g. checking that we're > > only skipping copying prior to last_common_segno), but I do not think > > we should worry too much about the user manually injecting invalid WAL > > files. I mean, I would prefer that if that does happen, it either > > works anyway or fails with a sensible error message, rather than > > emitting an incomprehensible error message or dumping core. But, it is > > in general true that if manual modifications are made to the data > > directory, things may go terribly wrong, and this code is not obliged > > to provide any more protection against such scenarios than we do in > > other cases. Ultimately, such modifications are user error. > > > > It feels like there's a lot of things we could attempt to ensure > "correctness" if we are concerned about scenarios when the user manually > puts > or modifies content unexpectedly in the pg_wal directory. > > For instance, one could make the argument that when considering to skip > copying the common WAL segments, even though they are of the same > size, it's possible the user has manipulated them directly. I don't > think we need to > run checksums on every WAL segment that is a valid candidate to ensure they > match. I agree with Robert here, and i think if we found that the WAL filename is not valid after parsing we can return something like FILE_CONTENT_INVALID_TYPE in getFileContentType so that it won't end up going through decide_wal_file_action,then XLogFromFileName may assign file_segno invalid values greater than last_common_segno which means we lead to copying it,even though it's an invalid file, thoughts? 2) Please fix the indentation by running pgindent, also add file_content_type_t typedef to typedefs.list file. 3) maybe we can improve the error messages for the last 2 checks in tap test, that because of this reason hence proven that the copy from source to target has been done. 4) We need to update the pg_rewind docs regarding this new behaviour. -- Thanks, Srinath Reddy Sadipiralla EDB: https://www.enterprisedb.com/
