Hello hackers, After fixing bug #16161 (pg_ctl inability to open just deleted postmaster.pid) there are still some errors related to the same Windows-only issue. Namely, pg_upgradeCheck failures seen on https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=fairywren&br=REL_13_STABLE Here pg_ls_dir_files couldn't handle the "delete pending" state too.
I've made a simple test to reproduce this behavior with pg_ls_waldir, that fails with: error running SQL: 'psql:<stdin>:1: ERROR: could not stat file "pg_wal/dummy": Permission denied' As noted in [1], a sensible solution would be putting the same "retry on ERROR_ACCESS_DENIED" action in a wrapper for stat(). And bed90759f brought in master the _pgstat64() function, where such error handling should be placed. So please look at the patch (based on the previous one made to fix bug#16161), that makes the attached test pass. And I have a few questions. For now genfile.c needs to include "port.h" explicitly to override definitions in "sys/stat.h", that override "stat" redefinition for Windows included implicitly above via "postgres.h". Without it the pg_ls_dir_files() function just uses system' stat(). So I wonder, is there a way to make all stat-calling code use the improved stat()? And if so, maybe the stat() call in pgwin32_open should be replaced with microsoft_native_stat(). And regarding released versions, are there any reasons to not backport bed90759f (with the proposed fix or alike)? [1] https://www.postgresql.org/message-id/396837.1605298573%40sss.pgh.pa.us Best regards, Alexander
001_delete_pending.pl
Description: Perl program
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c index d34182a7b0..922df49125 100644 --- a/src/backend/utils/adt/genfile.c +++ b/src/backend/utils/adt/genfile.c @@ -28,6 +28,7 @@ #include "funcapi.h" #include "mb/pg_wchar.h" #include "miscadmin.h" +#include "port.h" #include "postmaster/syslogger.h" #include "storage/fd.h" #include "utils/acl.h" diff --git a/src/port/win32stat.c b/src/port/win32stat.c index 4351aa4d08..c2b55c7fa6 100644 --- a/src/port/win32stat.c +++ b/src/port/win32stat.c @@ -170,6 +170,7 @@ _pgstat64(const char *name, struct stat *buf) SECURITY_ATTRIBUTES sa; HANDLE hFile; int ret; + int loops = 0; #if _WIN32_WINNT < 0x0600 IO_STATUS_BLOCK ioStatus; FILE_STANDARD_INFORMATION standardInfo; @@ -182,31 +183,60 @@ _pgstat64(const char *name, struct stat *buf) errno = EINVAL; return -1; } - /* fast not-exists check */ if (GetFileAttributes(name) == INVALID_FILE_ATTRIBUTES) { - _dosmaperr(GetLastError()); - return -1; + DWORD err = GetLastError(); + + if (err != ERROR_ACCESS_DENIED) { + _dosmaperr(err); + return -1; + } } /* get a file handle as lightweight as we can */ sa.nLength = sizeof(SECURITY_ATTRIBUTES); sa.bInheritHandle = TRUE; sa.lpSecurityDescriptor = NULL; - hFile = CreateFile(name, - GENERIC_READ, - (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE), - &sa, - OPEN_EXISTING, - (FILE_FLAG_NO_BUFFERING | FILE_FLAG_BACKUP_SEMANTICS | - FILE_FLAG_OVERLAPPED), - NULL); - if (hFile == INVALID_HANDLE_VALUE) + while ((hFile = CreateFile(name, + GENERIC_READ, + (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE), + &sa, + OPEN_EXISTING, + (FILE_FLAG_NO_BUFFERING | FILE_FLAG_BACKUP_SEMANTICS | + FILE_FLAG_OVERLAPPED), + NULL)) == INVALID_HANDLE_VALUE) { DWORD err = GetLastError(); - CloseHandle(hFile); + /* + * ERROR_ACCESS_DENIED is returned if the file is deleted but not yet + * gone (Windows NT status code is STATUS_DELETE_PENDING). In that + * case we want to wait a bit and try again, giving up after 1 second + * (since this condition should never persist very long). However, + * there are other commonly-hit cases that return ERROR_ACCESS_DENIED, + * so care is needed. In particular that happens if we try to open a + * directory, or of course if there's an actual file-permissions + * problem. To distinguish these cases, try a stat(). In the + * delete-pending case, it will either also get STATUS_DELETE_PENDING, + * or it will see the file as gone and fail with ENOENT. In other + * cases it will usually succeed. The only somewhat-likely case where + * this coding will uselessly wait is if there's a permissions problem + * with a containing directory, which we hope will never happen in any + * performance-critical code paths. + */ + if (err == ERROR_ACCESS_DENIED) + { + struct microsoft_native_stat st; + + if (microsoft_native_stat(name, &st) != 0) + { + pg_usleep(100000); + loops++; + continue; + } + } + _dosmaperr(err); return -1; }