Summary:
 * At the very least a documentation change is needed on the subject of the errno value from disk i/o errors.
 * Ticket #2398 (on the subject of EINTR) should probably be reopened.

I have recently had an apparently isolated failure of a program making some updates to a sqlite database. The only information I have is this error message:

  DBD::SQLite::st execute failed: disk I/O error(10) at dbdimp.c line 423 [for Statement "SELECT * FROM sell NATURAL LEFT JOIN commods WHERE commodname IS NULL"] at CommodsDatabase.pm line 158.
  PROCESSING FAILED

CommodsDatabase.pm is my code. That part is doing a consistency check before saying COMMIT. I don't know exactly what sqlite was doing, but I was alarmed.

I checked my system logs and there are no reports of problems with the disks. There are no reports of the filesystem having been full either; that remains possible but does not seem likely.

Unfortunately I'm at a dead end investigating this because of the lack of an errno value or other detail from sqlite. Knowing that a system call failed is not by itself sufficient; we need to know the errno value at the very least (knowing which system call failed, and on what kind of object, would be nice but is much less important).

If SQLITE_IOERR means that a system call failed, it is in my view essential that either the errno value is somehow incorporated in sqlite's response, or the documentation makes it clear that the calling application must also report errno. Additionally, sqlite must ensure that any subsequent syscalls it makes before returning to the caller don't overwrite the value of errno (or its equivalent on other platforms), for example by saving and restoring it.

This and a related matter of an apparently buggy check for disk full are the subject of Ticket #3107.

If the problem wasn't disk full, then the only other candidate cause that I found in my searches was Ticket #2398, regarding sqlite's handling of syscalls which return EINTR. The submitter of #2398 is entirely correct. Many of the comments have misunderstood the issue and muddied the waters. SA_RESTART is a red herring, because sqlite is not entitled to assume that the calling application has set it.

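For concreteness, here is a minimal sketch in C of the two behaviours asked for above: retrying a syscall that was interrupted by a signal rather than relying on SA_RESTART, and saving and restoring errno around cleanup. It assumes a POSIX system. The names write_all_retry and write_and_close are hypothetical, chosen for illustration; they are not taken from sqlite's source and this is not a claim about how sqlite does or should structure its I/O layer.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not sqlite code): write all of buf to fd,
 * retrying when a signal interrupts the call before any data is
 * transferred (EINTR). Returns 0 on success, or -1 with errno left
 * as set by the failing write(). */
static int write_all_retry(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: just retry */
            return -1;          /* genuine failure: errno says why */
        }
        p += n;                 /* partial write: advance and go round again */
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: write, then close. If the write fails, the
 * cleanup close() must not be allowed to clobber the errno value
 * that explains the failure, so it is saved and restored. */
static int write_and_close(int fd, const void *buf, size_t len)
{
    if (write_all_retry(fd, buf, len) != 0) {
        int saved_errno = errno;
        close(fd);              /* may itself modify errno */
        errno = saved_errno;
        return -1;
    }
    return close(fd);
}
```

A caller that gets -1 back from such a function can report strerror(errno) alongside the generic "disk I/O error", which is exactly the detail missing from the message I received.
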
I'm not wholly convinced that this is the cause of my problem, but the bug reported in #2398 would definitely produce the kind of apparently random lossage, if my program happened to get a signal at the wrong moment.

Ian.