Hello hackers, After fixing bug #16161 (pg_ctl inability to open just deleted postmaster.pid) there are still some errors related to the same Windows-only issue. Namely, pg_upgradeCheck failures seen on https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=fairywren&br=REL_13_STABLE Here pg_ls_dir_files couldn't handle the "delete pending" state too.
I've made a simple test to reproduce this behavior with pg_ls_waldir, that fails with: error running SQL: 'psql:<stdin>:1: ERROR: could not stat file "pg_wal/dummy": Permission denied' As noted in [1], a sensible solution would be putting the same "retry on ERROR_ACCESS_DENIED" action in a wrapper for stat(). And bed90759f brought in master the _pgstat64() function, where such error handling should be placed. So please look at the patch (based on the previous one made to fix bug#16161), that makes the attached test pass. And I have a few questions. For now genfile.c needs to include "port.h" explicitly to override definitions in "sys/stat.h", that override "stat" redefinition for Windows included implicitly above via "postgres.h". Without it the pg_ls_dir_files() function just uses system' stat(). So I wonder, is there a way to make all stat-calling code use the improved stat()? And if so, maybe the stat() call in pgwin32_open should be replaced with microsoft_native_stat(). And regarding released versions, are there any reasons to not backport bed90759f (with the proposed fix or alike)? [1] https://www.postgresql.org/message-id/396837.1605298573%40sss.pgh.pa.us Best regards, Alexander
001_delete_pending.pl
Description: Perl program
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index d34182a7b0..922df49125 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -28,6 +28,7 @@
#include "funcapi.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
+#include "port.h"
#include "postmaster/syslogger.h"
#include "storage/fd.h"
#include "utils/acl.h"
diff --git a/src/port/win32stat.c b/src/port/win32stat.c
index 4351aa4d08..c2b55c7fa6 100644
--- a/src/port/win32stat.c
+++ b/src/port/win32stat.c
@@ -170,6 +170,7 @@ _pgstat64(const char *name, struct stat *buf)
SECURITY_ATTRIBUTES sa;
HANDLE hFile;
int ret;
+ int loops = 0;
#if _WIN32_WINNT < 0x0600
IO_STATUS_BLOCK ioStatus;
FILE_STANDARD_INFORMATION standardInfo;
@@ -182,31 +183,60 @@ _pgstat64(const char *name, struct stat *buf)
errno = EINVAL;
return -1;
}
-
/* fast not-exists check */
if (GetFileAttributes(name) == INVALID_FILE_ATTRIBUTES)
{
- _dosmaperr(GetLastError());
- return -1;
+ DWORD err = GetLastError();
+
+ if (err != ERROR_ACCESS_DENIED) {
+ _dosmaperr(err);
+ return -1;
+ }
}
/* get a file handle as lightweight as we can */
sa.nLength = sizeof(SECURITY_ATTRIBUTES);
sa.bInheritHandle = TRUE;
sa.lpSecurityDescriptor = NULL;
- hFile = CreateFile(name,
- GENERIC_READ,
- (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE),
- &sa,
- OPEN_EXISTING,
- (FILE_FLAG_NO_BUFFERING | FILE_FLAG_BACKUP_SEMANTICS |
- FILE_FLAG_OVERLAPPED),
- NULL);
- if (hFile == INVALID_HANDLE_VALUE)
+ while ((hFile = CreateFile(name,
+ GENERIC_READ,
+ (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE),
+ &sa,
+ OPEN_EXISTING,
+ (FILE_FLAG_NO_BUFFERING | FILE_FLAG_BACKUP_SEMANTICS |
+ FILE_FLAG_OVERLAPPED),
+ NULL)) == INVALID_HANDLE_VALUE)
{
DWORD err = GetLastError();
- CloseHandle(hFile);
+ /*
+ * ERROR_ACCESS_DENIED is returned if the file is deleted but not yet
+ * gone (Windows NT status code is STATUS_DELETE_PENDING). In that
+ * case we want to wait a bit and try again, giving up after 1 second
+ * (since this condition should never persist very long). However,
+ * there are other commonly-hit cases that return ERROR_ACCESS_DENIED,
+ * so care is needed. In particular that happens if we try to open a
+ * directory, or of course if there's an actual file-permissions
+ * problem. To distinguish these cases, try a stat(). In the
+ * delete-pending case, it will either also get STATUS_DELETE_PENDING,
+ * or it will see the file as gone and fail with ENOENT. In other
+ * cases it will usually succeed. The only somewhat-likely case where
+ * this coding will uselessly wait is if there's a permissions problem
+ * with a containing directory, which we hope will never happen in any
+ * performance-critical code paths.
+ */
+ if (err == ERROR_ACCESS_DENIED)
+ {
+ struct microsoft_native_stat st;
+
+ if (microsoft_native_stat(name, &st) != 0)
+ {
+ pg_usleep(100000);
+ loops++;
+ continue;
+ }
+ }
+
_dosmaperr(err);
return -1;
}
