Thanks to the efforts of Harry and Henry, I think we have a lead on the problem, reported on and off on Windows, where the dataset gets truncated and mysterious write errors on the temporary files are reported. It seems that the bug has the potential to cause problems on systems other than Windows too.

I believe the cause is the optimisation in src/libpspp/ext-array.c, which contains this code:

    static bool
    do_seek (const struct ext_array *ea_, off_t offset)
    {
      struct ext_array *ea = CONST_CAST (struct ext_array *, ea_);
      if (!ext_array_error (ea))
        {
          if (ea->position == offset)
            return true;
          else if (fseeko (ea->file, offset, SEEK_SET) == 0)
            {
              ea->position = offset;
              return true;
            }
          else
            error (0, errno, _("seeking in temporary file"));
        }

      return false;
    }

The lines:

    if (ea->position == offset)
      return true;

avoid performing a seek if the destination of the potential seek happens to be the current position (which would be the most common case). This is OK, except when the current operation is a write and the previous one was a read (or vice versa). The POSIX spec says:

    When a file is opened with update mode ('+' as the second or third
    character in the mode argument), both input and output may be
    performed on the associated stream. However, the application shall
    ensure that output is not directly followed by input without an
    intervening call to fflush() or to a file positioning function
    (fseek(), fsetpos(), or rewind()), and input is not directly
    followed by output without an intervening call to a file
    positioning function, unless the input operation encounters
    end-of-file.

[http://pubs.opengroup.org/onlinepubs/009695399/functions/fopen.html]

By avoiding the seek, we are violating this condition. The Microsoft documentation says basically the same:

    When the "r+", "w+", or "a+" access type is specified, both reading
    and writing are allowed (the file is said to be open for "update").
    However, when you switch between reading and writing, there must be
    an intervening fflush, fsetpos, fseek, or rewind operation. The
    current position can be specified for the fsetpos or fseek
    operation, if desired.
[http://msdn.microsoft.com/en-us/library/yeby3zcb%28v=vs.80%29.aspx]

Interestingly, until quite recently, the GNU libc documentation, after mentioning that this requirement exists in the ANSI standard, then had the sentence:

    The GNU C library does not have this limitation; you can do
    arbitrary reading and writing operations on a stream in whatever
    order.

But this sentence has recently been deleted, and the bug report which gave rise to its deletion suggests that it was, and had for a long time been, erroneous:

[http://pubs.opengroup.org/onlinepubs/009695399/functions/fopen.html]

So it would seem that it is unsafe (even on GNU/Linux) not to seek (or flush) before switching between reading and writing.

I sent Harry a patch which basically disabled this optimisation completely. Henry's report suggested that this fixed the problem, but caused the operation to take a lot longer to run (not surprising).

I suggest that the correct fix should involve a flag in the ext_array struct which records the direction of the most recent operation (read or write) and ensures that the seek is always done if the direction has changed.

J'

--
PGP Public key ID: 1024D/2DE827B3
fingerprint = 8797 A26D 0854 2EAB 0285 A290 8A67 719C 2DE8 27B3
See http://keys.gnupg.net or any PGP keyserver for public key.
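
For illustration, the flag-based fix suggested above might look roughly like the following. This is only a sketch under stated assumptions, not the actual PSPP code: the names demo_array, last_op, demo_seek, demo_read and demo_write are hypothetical, and plain fseek() with a long offset stands in for fseeko() with off_t to keep the example portable. The idea is that the seek is skipped only when both the position and the direction are unchanged.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the direction flag; not PSPP's definitions. */
enum last_op { OP_NONE, OP_READ, OP_WRITE };

struct demo_array
  {
    FILE *file;        /* Stream opened in update mode, e.g. "wb+". */
    long position;     /* Offset we believe the stream is at. */
    enum last_op op;   /* Direction of the most recent operation. */
  };

/* Seek unless the stream is already at OFFSET *and* the previous
   operation had the same direction as OP. */
static bool
demo_seek (struct demo_array *a, long offset, enum last_op op)
{
  if (a->position != offset || a->op != op)
    {
      if (fseek (a->file, offset, SEEK_SET) != 0)
        {
          a->op = OP_NONE;   /* Unknown state: force a seek next time. */
          return false;
        }
      a->position = offset;
    }
  a->op = op;
  return true;
}

static bool
demo_read (struct demo_array *a, long offset, void *buf, size_t n)
{
  if (!demo_seek (a, offset, OP_READ))
    return false;
  if (fread (buf, 1, n, a->file) != n)
    {
      a->op = OP_NONE;       /* Position is now uncertain. */
      return false;
    }
  a->position += (long) n;
  return true;
}

static bool
demo_write (struct demo_array *a, long offset, const void *buf, size_t n)
{
  if (!demo_seek (a, offset, OP_WRITE))
    return false;
  if (fwrite (buf, 1, n, a->file) != n)
    {
      a->op = OP_NONE;       /* Position is now uncertain. */
      return false;
    }
  a->position += (long) n;
  return true;
}
```

With this arrangement, consecutive reads or consecutive writes at the expected position still avoid the seek, so most of the benefit of the optimisation should be retained, while every read/write switch gets the intervening positioning call that the standard requires.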
_______________________________________________ pspp-dev mailing list pspp-dev@gnu.org https://lists.gnu.org/mailman/listinfo/pspp-dev