Summary:
  * At the very least a documentation change is needed on the subject
    of the errno value from disk i/o errors.
  * Ticket #2398 (on the subject of EINTR) should probably be reopened.


I have recently had an apparently isolated failure of a program making
some updates to a sqlite database.  The only information I have is
this error message:

  DBD::SQLite::st execute failed: disk I/O error(10) at dbdimp.c line 423 [for 
Statement "SELECT * FROM sell NATURAL LEFT JOIN commods WHERE commodname IS 
NULL"] at CommodsDatabase.pm line 158.
  PROCESSING FAILED

CommodsDatabase.pm is my code.  That part is doing a consistency check
before saying COMMIT.  I don't know exactly what sqlite was doing, but
I was alarmed.  I checked my system logs and there are no reports of
problems with the disks.  There are no reports of the filesystem
having been full; while that is possible, it doesn't seem likely.

Unfortunately I'm at a dead-end investigating this because of the lack
of an errno value or other detail from sqlite.  Knowing that a system
call failed is not by itself sufficient; we need to know the errno value
at the very least (knowing which system call and on what kind of
object would be nice but is much less important).

If SQLITE_IOERR means that a system call failed, it is in my view
essential that either the errno value is somehow incorporated in
sqlite's response, or the documentation makes it clear that the
calling application must also report errno.

Additionally, sqlite must ensure that any subsequent syscalls it makes
before returning to the caller don't overwrite the value of errno (or
other equivalent on other platforms), for example by saving and
restoring it.
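
To illustrate the save-and-restore pattern I mean, here is a minimal
sketch.  It is not sqlite's code: open_checked() and noisy_cleanup()
are hypothetical names invented for this example, and the failing
close(-1) merely stands in for whatever cleanup syscalls a library
might make after the original failure.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical cleanup step whose own syscall fails.
 * close(-1) fails with EBADF and so overwrites errno. */
static void noisy_cleanup(void)
{
    close(-1);
}

/* Hypothetical helper: open a file read-only.
 * Returns the fd, or -1 with errno describing why open() failed,
 * even though further syscalls are made before returning. */
int open_checked(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        int saved_errno = errno;   /* remember the real cause */
        noisy_cleanup();           /* would otherwise leave EBADF in errno */
        errno = saved_errno;       /* restore it for the caller */
        return -1;
    }
    return fd;
}
```

Without the two saved_errno lines, a caller that prints strerror(errno)
after a failure would report "Bad file descriptor" rather than the real
reason the open failed.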

This and a related matter of an apparently-buggy check for disk full
are the subject of Ticket #3107.


If the problem wasn't disk full then the only other candidate cause
that I found in my searches was Ticket #2398, regarding sqlite's
handling of syscalls which return EINTR.  The submitter of #2398 is
entirely correct.

Many of the comments on that ticket misunderstand the issue and muddy
the waters.  SA_RESTART is a red herring, because sqlite is not
entitled to assume that the calling application has set it.

I'm not wholly convinced that this is the cause of my problem, but the
bug reported in #2398 would definitely produce the kind of
apparently random lossage, if my program happened to get a signal at
the wrong moment.


Ian.