Re: Incorrect errno used with %m for backend code
On Sun, Jun 24, 2018 at 09:23:50PM +0900, Michael Paquier wrote:
> On Sun, Jun 24, 2018 at 09:22:01AM +0530, Ashutosh Sharma wrote:
>> Okay, I too had a quick look into the source code to see if there are
>> still some places where we could have missed to set an errno to ENOSPC
>> in case of write system call failure but, couldn't find any such place
>> in the code. The v2 version of patch looks good to me.
>
> Thanks for the review.  I'll try to wrap that tomorrow with proper
> patches for back-branches as things diverge a bit here and there.

Pushed down to 9.3.  All branches conflicted, and particularly in
pg_basebackup some write() calls have been moved around.
--
Michael
Re: Incorrect errno used with %m for backend code
On Sat, Jun 23, 2018 at 6:43 PM, Michael Paquier wrote:
> On Fri, Jun 22, 2018 at 03:45:33PM +0530, Ashutosh Sharma wrote:
>> Okay, thanks for the confirmation. Few of them are also there in
>> origin.c and snapbuild.c files.
>
> Thanks Ashutosh.  I have been reviewing the whole tree and I found more
> places where this is missing, like rewriteheap.c, reorderbuffer.c or
> pg_basebackup, which gives the attached.

Okay, I too had a quick look into the source code to see if there are
still some places where we could have missed setting errno to ENOSPC
in case of a write system call failure, but couldn't find any such
place in the code.  The v2 version of the patch looks good to me.

So, to conclude, the v2 patch fixes two things:
1) It makes ereport print the correct error number (the error number
   that matches the error message).
2) It sets errno to ENOSPC (assuming that the problem is no disk
   space) if the write system call fails without setting errno.

--
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
Re: Incorrect errno used with %m for backend code
On Fri, Jun 22, 2018 at 11:15:53AM -0400, Alvaro Herrera wrote:
> I wondered for a bit if it would be a better idea to have
> CloseTransientFile restore the existing errno if there is any and close
> works fine -- when I noticed that that function itself says that caller
> should check errno for close() errors.  Most callers seem to do it
> correctly, but a few fail to check the return value.

Yeah, the API in its current shape is simpler to understand.  Once you
get used to the errno stanza, of course...

> A bunch of other places use CloseTransientFile while closing shop after
> some other syscall failure, which is what your patch fixes.  I didn't
> review those; checking for close failure seems pointless.

Agreed.

> In some cases, we fsync the file and check that return value, then close
> the file and do *not* check CloseTransientFile's return value --
> examples are CheckPointLogicalRewriteHeap, heap_xlog_logical_rewrite,
> SnapBuildSerialize, SaveSlotToPath, durable_rename.  I don't know if
> it's reasonable for close() to fail after successfully fsyncing a file;
> maybe this is not a problem.  I would patch those nonetheless.

And writeTimeLineHistory.

> be_lo_export() is certainly missing a check: it writes and closes,
> without intervening fsync.

One problem at a time if possible :)  I think that those adjustments
should be a separate patch.
--
Michael
Re: Incorrect errno used with %m for backend code
On Fri, Jun 22, 2018 at 03:45:33PM +0530, Ashutosh Sharma wrote:
> Okay, thanks for the confirmation. Few of them are also there in
> origin.c and snapbuild.c files.

Thanks Ashutosh.  I have been reviewing the whole tree and I found more
places where this is missing, like rewriteheap.c, reorderbuffer.c or
pg_basebackup, which gives the attached.
--
Michael

diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 8d3c861a33..ed7ba181c7 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -1168,9 +1168,14 @@ heap_xlog_logical_rewrite(XLogReaderState *r)
 	/* write out tail end of mapping file (again) */
 	pgstat_report_wait_start(WAIT_EVENT_LOGICAL_REWRITE_MAPPING_WRITE);
 	if (write(fd, data, len) != len)
+	{
+		/* if write didn't set errno, assume problem is no disk space */
+		if (errno == 0)
+			errno = ENOSPC;
 		ereport(ERROR,
 				(errcode_for_file_access(),
 				 errmsg("could not write to file \"%s\": %m", path)));
+	}
 	pgstat_report_wait_end();
 
 	/*
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 65194db70e..a9ef1b3d73 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -1241,12 +1241,17 @@ ReadTwoPhaseFile(TransactionId xid, bool give_warnings)
 	 */
 	if (fstat(fd, &stat))
 	{
+		int			save_errno = errno;
+
 		CloseTransientFile(fd);
 		if (give_warnings)
+		{
+			errno = save_errno;
 			ereport(WARNING,
 					(errcode_for_file_access(),
 					 errmsg("could not stat two-phase state file \"%s\": %m",
 							path)));
+		}
 		return NULL;
 	}
 
@@ -1274,13 +1279,18 @@ ReadTwoPhaseFile(TransactionId xid, bool give_warnings)
 	pgstat_report_wait_start(WAIT_EVENT_TWOPHASE_FILE_READ);
 	if (read(fd, buf, stat.st_size) != stat.st_size)
 	{
+		int			save_errno = errno;
+
 		pgstat_report_wait_end();
 		CloseTransientFile(fd);
 		if (give_warnings)
+		{
+			errno = save_errno;
 			ereport(WARNING,
 					(errcode_for_file_access(),
 					 errmsg("could not read two-phase state file \"%s\": %m",
 							path)));
+		}
 		pfree(buf);
 		return NULL;
 	}
@@ -1663,16 +1673,26 @@ RecreateTwoPhaseFile(TransactionId xid, void *content, int len)
 	pgstat_report_wait_start(WAIT_EVENT_TWOPHASE_FILE_WRITE);
 	if (write(fd, content, len) != len)
 	{
+		int			save_errno = errno;
+
 		pgstat_report_wait_end();
 		CloseTransientFile(fd);
+
+		/* if write didn't set errno, assume problem is no disk space */
+		errno = save_errno ? save_errno : ENOSPC;
 		ereport(ERROR,
 				(errcode_for_file_access(),
 				 errmsg("could not write two-phase state file: %m")));
 	}
 	if (write(fd, &statefile_crc, sizeof(pg_crc32c)) != sizeof(pg_crc32c))
 	{
+		int			save_errno = errno;
+
 		pgstat_report_wait_end();
 		CloseTransientFile(fd);
+
+		/* if write didn't set errno, assume problem is no disk space */
+		errno = save_errno ? save_errno : ENOSPC;
 		ereport(ERROR,
 				(errcode_for_file_access(),
 				 errmsg("could not write two-phase state file: %m")));
@@ -1686,7 +1706,10 @@ RecreateTwoPhaseFile(TransactionId xid, void *content, int len)
 	pgstat_report_wait_start(WAIT_EVENT_TWOPHASE_FILE_SYNC);
 	if (pg_fsync(fd) != 0)
 	{
+		int			save_errno = errno;
+
 		CloseTransientFile(fd);
+		errno = save_errno;
 		ereport(ERROR,
 				(errcode_for_file_access(),
 				 errmsg("could not fsync two-phase state file: %m")));
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index adbd6a2126..1a419aa49b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -3268,7 +3268,10 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
 	pgstat_report_wait_start(WAIT_EVENT_WAL_INIT_SYNC);
 	if (pg_fsync(fd) != 0)
 	{
+		int			save_errno = errno;
+
 		close(fd);
+		errno = save_errno;
 		ereport(ERROR,
 				(errcode_for_file_access(),
 				 errmsg("could not fsync file \"%s\": %m", tmppath)));
@@ -11675,8 +11678,10 @@ retry:
 	if (lseek(readFile, (off_t) readOff, SEEK_SET) < 0)
 	{
 		char		fname[MAXFNAMELEN];
+		int			save_errno = errno;
 
 		XLogFileName(fname, curFileTLI, readSegNo, wal_segment_size);
+		errno = save_errno;
 		ereport(emode_for_corrupt_record(emode, targetPagePtr + reqLen),
 				(errcode_for_file_access(),
 				 errmsg("could not seek in log segment %s to offset %u: %m",
@@ -11688,9 +11693,11 @@ retry:
 	if (read(readFile, readBuf, XLOG_BLCKSZ) != XLOG_BLCKSZ)
 	{
 		char		fname[MAXFNAMELEN];
+		int			save_errno = errno;
 
 		pgstat_report_wait_end();
 		XLogFileName(fname, curFileTLI, readSegNo, wal_segment_size);
+		errno = save_errno;
 		ereport(emode_for_corrupt_record(emode, targetPagePtr + reqLen),
 				(errcode_for_file_access(),
 				 errmsg("could not read from log segment %s, offset %u: %m",
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 52fe55e2af..4ecdc9220f 100644
--- a/src/backend/access/transam/xlogutils.c
+++
Re: Incorrect errno used with %m for backend code
Hi,

On 2018-06-22 11:15:53 -0400, Alvaro Herrera wrote:
> I don't understand the logic in RestoreSlotFromDisk that fsync the file
> prior to reading it.  What are we protecting against?

We could previously have crashed before fsyncing.  But we can only rely
on slot contents being durable if we fsync them.  Therefore we fsync
before reading, after a crash.

> Anyway, while I think it would be nice to have CloseTransientFile
> restore the original errno if there was one and close goes well, it
> probably unduly complicates its API contract.

Yea, agreed.

Greetings,

Andres Freund
Re: Incorrect errno used with %m for backend code
On 2018-Jun-22, Michael Paquier wrote:
> A couple of places use CloseTransientFile without bothering much that
> this can overwrite errno.  I was tempted in keeping errno saved and kept
> if set to a non-zero value, but refrained from doing so as some callers
> may rely on the existing behavior, and the attached shows better the
> intention.

I wondered for a bit if it would be a better idea to have
CloseTransientFile restore the existing errno if there is any and close
works fine -- when I noticed that that function itself says that the
caller should check errno for close() errors.  Most callers seem to do
it correctly, but a few fail to check the return value.

Quite some places open files O_RDONLY, so the lack of error checking
after closing seems OK.  (Unless there's some funny interaction with
the fsync fiasco, of course.)

A bunch of other places use CloseTransientFile while closing shop after
some other syscall failure, which is what your patch fixes.  I didn't
review those; checking for close failure there seems pointless.

In some cases, we fsync the file and check that return value, then close
the file and do *not* check CloseTransientFile's return value --
examples are CheckPointLogicalRewriteHeap, heap_xlog_logical_rewrite,
SnapBuildSerialize, SaveSlotToPath, durable_rename.  I don't know if
it's reasonable for close() to fail after successfully fsyncing a file;
maybe this is not a problem.  I would patch those nonetheless.

be_lo_export() is certainly missing a check: it writes and closes,
without an intervening fsync.

I don't understand the logic in RestoreSlotFromDisk that fsyncs the
file prior to reading it.  What are we protecting against?

Anyway, while I think it would be nice to have CloseTransientFile
restore the original errno if there was one and close goes well, it
probably unduly complicates its API contract.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Incorrect errno used with %m for backend code
On Fri, Jun 22, 2018 at 2:44 PM, Michael Paquier wrote:
> On Fri, Jun 22, 2018 at 01:00:45PM +0530, Ashutosh Sharma wrote:
>> It seems like in case of few system calls for e.g. write system call,
>> errno is not set even if the number of bytes written is smaller than
>> the bytes requested and for such cases we explicitly set an errno to
>> ENOSPC.  Something like this,
>>
>> /*
>>  * if write didn't set errno, assume problem is no disk space
>>  */
>> errno = save_errno ? save_errno : ENOSPC;
>>
>> Shouldn't we do similar handling in your patch as well or please let
>> me know if i am missing something here.
>
> Yeah, I can see at least three of them in twophase.c.  Let's fix those
> as well at the same time.

Okay, thanks for the confirmation.  A few of them are also there in
origin.c and snapbuild.c files.
--
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
Re: Incorrect errno used with %m for backend code
On Fri, Jun 22, 2018 at 01:00:45PM +0530, Ashutosh Sharma wrote:
> It seems like in case of few system calls for e.g. write system call,
> errno is not set even if the number of bytes written is smaller than
> the bytes requested and for such cases we explicitly set an errno to
> ENOSPC.  Something like this,
>
> /*
>  * if write didn't set errno, assume problem is no disk space
>  */
> errno = save_errno ? save_errno : ENOSPC;
>
> Shouldn't we do similar handling in your patch as well or please let
> me know if i am missing something here.

Yeah, I can see at least three of them in twophase.c.  Let's fix those
as well at the same time.
--
Michael
Re: Incorrect errno used with %m for backend code
Hi,

It seems like in the case of a few system calls, e.g. the write system
call, errno is not set even if the number of bytes written is smaller
than the bytes requested, and for such cases we explicitly set errno to
ENOSPC.  Something like this:

/*
 * if write didn't set errno, assume problem is no disk space
 */
errno = save_errno ? save_errno : ENOSPC;

Shouldn't we do similar handling in your patch as well?  Or please let
me know if I am missing something here.

On Fri, Jun 22, 2018 at 11:45 AM, Michael Paquier wrote:
> Hi all,
>
> I have been reviewing the use of errno in the backend code when %m is
> used in the logs, and found more places than when I looked at improving
> the error messages for read() calls which bloat the errno value because
> of intermediate system calls.  As the problem is separate and should be
> back-patched, I have preferred beginning a new thread.
>
> A couple of places use CloseTransientFile without bothering much that
> this can overwrite errno.  I was tempted in keeping errno saved and kept
> if set to a non-zero value, but refrained from doing so as some callers
> may rely on the existing behavior, and the attached shows better the
> intention.
>
> Attached is a patch with everything I have spotted.  Any opinions or
> thoughts in getting this back-patched?
>
> Thanks,
> --
> Michael
Incorrect errno used with %m for backend code
Hi all,

I have been reviewing the use of errno in the backend code when %m is
used in the logs, and found more places than when I looked at improving
the error messages for read() calls which bloat the errno value because
of intermediate system calls.  As the problem is separate and should be
back-patched, I have preferred beginning a new thread.

A couple of places use CloseTransientFile without bothering much that
this can overwrite errno.  I was tempted in keeping errno saved and kept
if set to a non-zero value, but refrained from doing so as some callers
may rely on the existing behavior, and the attached shows better the
intention.

Attached is a patch with everything I have spotted.  Any opinions or
thoughts in getting this back-patched?

Thanks,
--
Michael

diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 65194db70e..e5571a67d6 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -1241,12 +1241,17 @@ ReadTwoPhaseFile(TransactionId xid, bool give_warnings)
 	 */
 	if (fstat(fd, &stat))
 	{
+		int			save_errno = errno;
+
 		CloseTransientFile(fd);
 		if (give_warnings)
+		{
+			errno = save_errno;
 			ereport(WARNING,
 					(errcode_for_file_access(),
 					 errmsg("could not stat two-phase state file \"%s\": %m",
 							path)));
+		}
 		return NULL;
 	}
 
@@ -1274,13 +1279,18 @@ ReadTwoPhaseFile(TransactionId xid, bool give_warnings)
 	pgstat_report_wait_start(WAIT_EVENT_TWOPHASE_FILE_READ);
 	if (read(fd, buf, stat.st_size) != stat.st_size)
 	{
+		int			save_errno = errno;
+
 		pgstat_report_wait_end();
 		CloseTransientFile(fd);
 		if (give_warnings)
+		{
+			errno = save_errno;
 			ereport(WARNING,
 					(errcode_for_file_access(),
 					 errmsg("could not read two-phase state file \"%s\": %m",
 							path)));
+		}
 		pfree(buf);
 		return NULL;
 	}
@@ -1663,16 +1673,22 @@ RecreateTwoPhaseFile(TransactionId xid, void *content, int len)
 	pgstat_report_wait_start(WAIT_EVENT_TWOPHASE_FILE_WRITE);
 	if (write(fd, content, len) != len)
 	{
+		int			save_errno = errno;
+
 		pgstat_report_wait_end();
 		CloseTransientFile(fd);
+		errno = save_errno;
 		ereport(ERROR,
 				(errcode_for_file_access(),
 				 errmsg("could not write two-phase state file: %m")));
 	}
 	if (write(fd, &statefile_crc, sizeof(pg_crc32c)) != sizeof(pg_crc32c))
 	{
+		int			save_errno = errno;
+
 		pgstat_report_wait_end();
 		CloseTransientFile(fd);
+		errno = save_errno;
 		ereport(ERROR,
 				(errcode_for_file_access(),
 				 errmsg("could not write two-phase state file: %m")));
@@ -1686,7 +1702,10 @@ RecreateTwoPhaseFile(TransactionId xid, void *content, int len)
 	pgstat_report_wait_start(WAIT_EVENT_TWOPHASE_FILE_SYNC);
 	if (pg_fsync(fd) != 0)
 	{
+		int			save_errno = errno;
+
 		CloseTransientFile(fd);
+		errno = save_errno;
 		ereport(ERROR,
 				(errcode_for_file_access(),
 				 errmsg("could not fsync two-phase state file: %m")));
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index adbd6a2126..1a419aa49b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -3268,7 +3268,10 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
 	pgstat_report_wait_start(WAIT_EVENT_WAL_INIT_SYNC);
 	if (pg_fsync(fd) != 0)
 	{
+		int			save_errno = errno;
+
 		close(fd);
+		errno = save_errno;
 		ereport(ERROR,
 				(errcode_for_file_access(),
 				 errmsg("could not fsync file \"%s\": %m", tmppath)));
@@ -11675,8 +11678,10 @@ retry:
 	if (lseek(readFile, (off_t) readOff, SEEK_SET) < 0)
 	{
 		char		fname[MAXFNAMELEN];
+		int			save_errno = errno;
 
 		XLogFileName(fname, curFileTLI, readSegNo, wal_segment_size);
+		errno = save_errno;
 		ereport(emode_for_corrupt_record(emode, targetPagePtr + reqLen),
 				(errcode_for_file_access(),
 				 errmsg("could not seek in log segment %s to offset %u: %m",
@@ -11688,9 +11693,11 @@ retry:
 	if (read(readFile, readBuf, XLOG_BLCKSZ) != XLOG_BLCKSZ)
 	{
 		char		fname[MAXFNAMELEN];
+		int			save_errno = errno;
 
 		pgstat_report_wait_end();
 		XLogFileName(fname, curFileTLI, readSegNo, wal_segment_size);
+		errno = save_errno;
 		ereport(emode_for_corrupt_record(emode, targetPagePtr + reqLen),
 				(errcode_for_file_access(),
 				 errmsg("could not read from log segment %s, offset %u: %m",
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 52fe55e2af..4ecdc9220f 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -718,9 +718,10 @@ XLogRead(char *buf, int segsize, TimeLineID tli, XLogRecPtr startptr,
 	if (lseek(sendFile, (off_t) startoff, SEEK_SET) < 0)
 	{
 		char		path[MAXPGPATH];
+		int			save_errno = errno;
 
 		XLogFilePath(path, tli, sendSegNo, segsize);
-
+		errno = save_errno;
 		ereport(ERROR,
 				(errcode_for_file_access(),
 				 errmsg("could not seek in log segment %s to offset %u: %m",
@@ -741,9 +742,10