[sqlite] recurrent failure mode
> >You might want to read my message on the topic from the list archives, >dated Sat, 31 Jan 2015. In this case, no concurrent or multiple users are involved. It's just one client and the database. There's still plenty of scope for the networked file system to do things that make sqlite fail.
[sqlite] recurrent failure mode
On 26 Feb 2015, at 12:26am, Dave Dyer wrote: >> Do you have any multi-access things going on ? Two or more computers, >> applications, processes or threads trying to access the database at the same >> time ? > > No, but it would be normal for the database to be on a different > computer than the sqlite client, and be using whatever networked > file system is common. The culprit clients seem to be macs, we're > still seeking more information about the specifics. Sorry to have to tell you, but almost no implementations of network file systems implement locking properly. So any situations involving simultaneous writing can lead to corruption. But if you do not have simultaneous writing, there's no good reason for database corruption. What you need to find out next is OS version that the computer holding the file is running. File system that the disk the file is on is formatted in. OS version that the computer accessing the file is running. Network protocol used to access the remote filespace (SMB or AFP or something like that). Earlier you wrote > I suppose that this might be a sqlite bug if the "insert records" step > and the "maintain indexes" step were separated by a disk error and the > rollback of the failed transaction was incomplete. Now you've told me that you are checking the result codes returned by all calls, this would have to be a significant bug in SQLite and I don't think I've heard of any other reports of it. Simon.
[sqlite] recurrent failure mode
On 25 Feb 2015, at 8:23pm, Dave Dyer wrote: > The facile explanation would be that a transaction to insert a new > record was executed twice, but the indexes were incorrectly maintained. > > INSERT INTO "preference_table" VALUES('Picture Placer-707-1304b-19-Maranda > Richardson','scrollPos','0'); Can you provide the schema (the CREATE TABLE and any CREATE INDEX commands) for that table ? Do you have any multi-access things going on ? Two or more computers, applications, processes or threads trying to access the database at the same time ? Does your application check the result code returned from all sqlite3_ calls ? Not just the ones inserting rows, but also those opening and closing the file, setting PRAGMAs, or anything else it does with the sqlite3_ library ? Simon.
[sqlite] recurrent failure mode
On Wed, 25 Feb 2015 16:26:45 -0800 Dave Dyer wrote: > >Do you have any multi-access things going on ? Two or more > >computers, applications, processes or threads trying to access the > >database at the same time ? > > No, but it would be normal for the database to be on a different > computer than the sqlite client, and be using whatever networked > file system is common. The culprit clients seem to be macs, we're > still seeking more information about the specifics. You might want to read my message on the topic from the list archives, dated Sat, 31 Jan 2015. Although "bugs" are frequently blamed, in fact the semantics of networked filesystems are different from that of local filesystems. Making a database work on a network filesystem might be possible, but requires considerable work to support cache coherency. Expecting SQLite to do that is not too different from expecting to read a letter the moment it's dropped in the mailbox across town. Just consider this scenario: two clients open the same database, where the file is on a fileserver somewhere on the network. Each one does, say, "select * from T" and peruses the data. Then, client A inserts record 17: lock record insert row free lock All good. Now, client B inserts record 17: lock record insert row free lock Should result in a primary key violation, right? No. Client A has updated his *local* cache, his in-memory image of some part of the database. That's not a SQLite cache; that's the kernel's filebuffer cache representing the state of some part of the filesystem. It's completely correct from A's point of view. SQLite depends on that cache being correct, and it is: from A's point of view. If B is on the same machine as A, they share a single, kernel-provided filebuffer cache, and when B attempt to insert the duplicate record, SQLite will see A's record and reject the insert. N machines is N caches. When A flushes his cache with sync(2), how long before B learns of the change? With N=1 (same machine), B learns instantaneously. With N > 1? NFS promises only that B will see A's changes after closing and reopening the file. When B is on a different machine, the local representation of the state of the filesystem does not include A's update. SQLite examines the "file", sees no record 17, and updates its local image. When the kernel eventually flushes B's update, A's local cache becomes stale. With some "luck", you can actually go on like this for a while with no one noticing. As long as different clients are updating different parts of the database and fortuitously refereshing their caches (by re-reading updated parts that the network isn't helpfully caching, too), it can seem to sort of work. Failure is guaranteed in the most important scenario: when SQLite requires a coherent cache and doesn't have one. So it really doesn't matter if the locking mechanism on network filesystems are perfect. By *design*, they make weaker promises than Posix. Expecting them to do something they're documented not to do is asking for trouble. Which is to say, unfortunately, that you got what you asked for. :-/ HTH. --jkl
[sqlite] recurrent failure mode
> >Can you provide the schema (the CREATE TABLE and any CREATE INDEX commands) >for that table ? CREATE TABLE preference_table ( preferenceSet text,/* name of this preference group */ preferenceName text, /* a preference in this group */ preferenceValue text /* sort order of this k...; CREATE UNIQUE INDEX preferenceindex on preference_table(preferenceSet,preferenceName); >Do you have any multi-access things going on ? Two or more computers, >applications, processes or threads trying to access the database at the same >time ? No, but it would be normal for the database to be on a different computer than the sqlite client, and be using whatever networked file system is common. The culprit clients seem to be macs, we're still seeking more information about the specifics. >Does your application check the result code returned from all sqlite3_ calls ? > Not just the ones inserting rows, but also those opening and closing the >file, setting PRAGMAs, or anything else it does with the sqlite3_ library ? Yes. It all goes through a common interface function which is careful about checking. As I said in the original message, this is something that has been working without problems for a few years, the only thing that's changing is the network and OS environment it's deployed in. My hypothesis is that a new failure mode in the file system is tickling a sqlite bug. Based on the evidence available now, a transaction that is trying to insert 4 records fails, and is retried, resulting in 8 records which can't be indexed.
[sqlite] recurrent failure mode
We're experiencing a new, recurrent failure mode in an old (ie; not recently changed) sqlite application. This may be associated with buggy networked file system implementations (thanks to apple and/or microsoft) The apparent problem is that indexes on a small table become corrupted by not being unique. Except for the non-uniqueness of the index keys, there's no apparent damage. The facile explanation would be that a transaction to insert a new record was executed twice, but the indexes were incorrectly maintained. INSERT INTO "preference_table" VALUES('Picture Placer-707-1304b-19-Maranda Richardson','scrollPos','0'); INSERT INTO "preference_table" VALUES('Picture Placer-707-1304b-19-Maranda Richardson','nFill','0'); INSERT INTO "preference_table" VALUES('Picture Placer-707-1304b-19-Maranda Richardson','placeInBW','0'); INSERT INTO "preference_table" VALUES('Picture Placer-707-1304b-19-Maranda Richardson','DB_Subset',''); I suppose that this might be a sqlite bug if the "insert records" step and the "maintain indexes" step were separated by a disk error and the rollback of the failed transaction was incomplete.
[sqlite] recurrent failure mode
We're experiencing a new, recurrent failure mode in an old (ie; not recently changed) sqlite application. This may be associated with buggy networked file system implementations (thanks to apple and/or microsoft) The apparent problem is that indexes on a small table become corrupted by not being unique. Except for the non-uniqueness of the index keys, there's no apparent damage. The facile explanation would be that a transaction to insert a new record was executed twice, but the indexes were incorrectly maintained. INSERT INTO "preference_table" VALUES('Picture Placer-707-1304b-19-Maranda Richardson','scrollPos','0'); INSERT INTO "preference_table" VALUES('Picture Placer-707-1304b-19-Maranda Richardson','nFill','0'); INSERT INTO "preference_table" VALUES('Picture Placer-707-1304b-19-Maranda Richardson','placeInBW','0'); INSERT INTO "preference_table" VALUES('Picture Placer-707-1304b-19-Maranda Richardson','DB_Subset',''); I suppose that this might be a sqlite bug if the "insert records" step and the "maintain indexes" step were separated by a disk error and the rollback of the failed transaction was incomplete.