Re: [HACKERS] [PATCH] `pg_dump -Fd` doesn't check write return status...
On Mon, May 05, 2014 at 08:29:56PM -0400, Bruce Momjian wrote: On Tue, Apr 22, 2014 at 03:19:15PM -0400, Bruce Momjian wrote: As is often the case with pg_dump, the problems you saw are a small part of a larger set of problems in that code --- there is general ignoring of read and write errors. I have developed a comprehensive patch that addresses all the issues I could find. The use of function pointers and the calling of functions directly and through function pointers makes the fix quite complex. I ended up placing checks at the lowest level and removing checks at higher layers where they were sporadically placed. Patch attached. I have tested dump/restore of all four dump output formats with the 9.3 regression database and all tests passed. I believe this patch is too complex to backpatch, and I don't know how it could be fixed more simply. Patch applied. Thanks for the report. This broke automatic detection of tar-format dumps: $ pg_dump -Ft -f tardump $ pg_restore tardump pg_restore: [archiver] could not read from input file: Success $ pg_restore -Ft tardump | wc -c 736 Stack trace: #0 vwrite_msg (modulename=0x4166c9 archiver, fmt=0x41aa08 could not read from input file: %s\n, ap=0x7fffde38) at pg_backup_utils.c:85 #1 0x0040f820 in exit_horribly (modulename=0x4166c9 archiver, fmt=0x41aa08 could not read from input file: %s\n) at parallel.c:190 #2 0x0040557d in _discoverArchiveFormat (AH=0x425340) at pg_backup_archiver.c:2040 #3 0x00405756 in _allocAH (FileSpec=0x7fffecbb tardump, fmt=archUnknown, compression=0, mode=archModeRead, setupWorkerPtr=0x4044f0 setupRestoreWorker) at pg_backup_archiver.c:2160 #4 0x00405819 in OpenArchive (FileSpec=optimized out, fmt=optimized out) at pg_backup_archiver.c:148 #5 0x00404331 in main (argc=2, argv=0x7fffea28) at pg_restore.c:375 -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] `pg_dump -Fd` doesn't check write return status...
On Tue, May 6, 2014 at 09:22:20AM -0400, Noah Misch wrote: On Mon, May 05, 2014 at 08:29:56PM -0400, Bruce Momjian wrote: On Tue, Apr 22, 2014 at 03:19:15PM -0400, Bruce Momjian wrote: As is often the case with pg_dump, the problems you saw are a small part of a larger set of problems in that code --- there is general ignoring of read and write errors. I have developed a comprehensive patch that addresses all the issues I could find. The use of function pointers and the calling of functions directly and through function pointers makes the fix quite complex. I ended up placing checks at the lowest level and removing checks at higher layers where they were sporadically placed. Patch attached. I have tested dump/restore of all four dump output formats with the 9.3 regression database and all tests passed. I believe this patch is too complex to backpatch, and I don't know how it could be fixed more simply. Patch applied. Thanks for the report. This broke automatic detection of tar-format dumps: $ pg_dump -Ft -f tardump $ pg_restore tardump pg_restore: [archiver] could not read from input file: Success $ pg_restore -Ft tardump | wc -c 736 OK, thanks for the report. Fixed. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] `pg_dump -Fd` doesn't check write return status...
On Tue, Apr 22, 2014 at 03:19:15PM -0400, Bruce Momjian wrote: On Sat, Mar 1, 2014 at 12:27:19PM -0800, Sean Chittenden wrote: The attached patch fixes the case when `pg_dump -Fd …` is called on a partition where write(2) fails for some reason or another. In this case, backup jobs were returning with a successful exit code even though most of the files in the dump directory were all zero length. I haven’t tested this patch’s failure conditions but the fix seems simple enough: cfwrite() needs to have its return status checked everywhere and exit_horribly() upon any failure. In this case, callers of _WriteData() were not checking the return status and were discarding the negative return status (e.g. ENOSPC). I made a cursory pass over the code and found one other instance where write status wasn’t being checked and also included that. As is often the case with pg_dump, the problems you saw are a small part of a larger set of problems in that code --- there is general ignoring of read and write errors. I have developed a comprehensive patch that addresses all the issues I could find. The use of function pointers and the calling of functions directly and through function pointers makes the fix quite complex. I ended up placing checks at the lowest level and removing checks at higher layers where they were sporadically placed. Patch attached. I have tested dump/restore of all four dump output formats with the 9.3 regression database and all tests passed. I believe this patch is too complex to backpatch, and I don't know how it could be fixed more simply. Patch applied. Thanks for the report. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] `pg_dump -Fd` doesn't check write return status...
On Sat, Mar 1, 2014 at 12:27:19PM -0800, Sean Chittenden wrote: The attached patch fixes the case when `pg_dump -Fd …` is called on a partition where write(2) fails for some reason or another. In this case, backup jobs were returning with a successful exit code even though most of the files in the dump directory were all zero length. I haven’t tested this patch’s failure conditions but the fix seems simple enough: cfwrite() needs to have its return status checked everywhere and exit_horribly() upon any failure. In this case, callers of _WriteData() were not checking the return status and were discarding the negative return status (e.g. ENOSPC). I made a cursory pass over the code and found one other instance where write status wasn’t being checked and also included that. As is often the case with pg_dump, the problems you saw are a small part of a larger set of problems in that code --- there is general ignoring of read and write errors. I have developed a comprehensive patch that addresses all the issues I could find. The use of function pointers and the calling of functions directly and through function pointers makes the fix quite complex. I ended up placing checks at the lowest level and removing checks at higher layers where they were sporadically placed. Patch attached. I have tested dump/restore of all four dump output formats with the 9.3 regression database and all tests passed. I believe this patch is too complex to backpatch, and I don't know how it could be fixed more simply. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c new file mode 100644 index 10bc3f0..bad21b5 *** a/src/bin/pg_dump/compress_io.c --- b/src/bin/pg_dump/compress_io.c *** static void InitCompressorZlib(Compresso *** 86,99 static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush); static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF); ! static size_t WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs, const char *data, size_t dLen); static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs); #endif /* Routines that support uncompressed data I/O */ static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF); ! static size_t WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs, const char *data, size_t dLen); /* --- 86,99 static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush); static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF); ! static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs, const char *data, size_t dLen); static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs); #endif /* Routines that support uncompressed data I/O */ static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF); ! static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs, const char *data, size_t dLen); /* *** ReadDataFromArchive(ArchiveHandle *AH, i *** 179,185 /* * Compress and write data to the output stream (via writeF). */ ! size_t WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs, const void *data, size_t dLen) { --- 179,185 /* * Compress and write data to the output stream (via writeF). */ ! void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs, const void *data, size_t dLen) { *** WriteDataToArchive(ArchiveHandle *AH, Co *** 190,203 { case COMPR_ALG_LIBZ: #ifdef HAVE_LIBZ ! return WriteDataToArchiveZlib(AH, cs, data, dLen); #else exit_horribly(modulename, not built with zlib support\n); #endif case COMPR_ALG_NONE: ! return WriteDataToArchiveNone(AH, cs, data, dLen); } ! return 0; /* keep compiler quiet */ } /* --- 190,205 { case COMPR_ALG_LIBZ: #ifdef HAVE_LIBZ ! WriteDataToArchiveZlib(AH, cs, data, dLen); #else exit_horribly(modulename, not built with zlib support\n); #endif + break; case COMPR_ALG_NONE: ! WriteDataToArchiveNone(AH, cs, data, dLen); ! break; } ! return; } /* *** DeflateCompressorZlib(ArchiveHandle *AH, *** 298,307 */ size_t len = cs-zlibOutSize - zp-avail_out; ! if (cs-writeF(AH, out, len) != len) ! exit_horribly(modulename, ! could not write to output file: %s\n, ! strerror(errno)); } zp-next_out = (void *) out; zp-avail_out = cs-zlibOutSize; --- 300,306 */ size_t len = cs-zlibOutSize - zp-avail_out; ! cs-writeF(AH, out, len); } zp-next_out = (void *) out; zp-avail_out = cs-zlibOutSize;
[HACKERS] [PATCH] `pg_dump -Fd` doesn't check write return status...
The attached patch fixes the case when `pg_dump -Fd …` is called on a partition where write(2) fails for some reason or another. In this case, backup jobs were returning with a successful exit code even though most of the files in the dump directory were all zero length. I haven’t tested this patch’s failure conditions but the fix seems simple enough: cfwrite() needs to have its return status checked everywhere and exit_horribly() upon any failure. In this case, callers of _WriteData() were not checking the return status and were discarding the negative return status (e.g. ENOSPC). I made a cursory pass over the code and found one other instance where write status wasn’t being checked and also included that. -sc pg_dump_check_write.patch Description: Binary data -- Sean Chittenden s...@chittenden.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers