On 02/07/2012 05:27 PM, Richard Hipp wrote:
On Tue, Feb 7, 2012 at 8:13 PM, David Barrett <dbarr...@expensify.com> wrote:
My best guess still is that the command-line shell is somehow not seeing
the shared-memory file that is created when the database is running in WAL
mode.  Or, perhaps the posix advisory locks are not working on that
shared-memory file.  The shared-memory file is the one with the "-shm"
suffix.  Of course, the command-line tool and the server process must both
be running on the same machine or else they won't be able to see each
other's shared memory, and stuff like this will happen.  You are running
both processes on the same machine, right?
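
A minimal Python sketch of how one might confirm that the "-shm" file really does sit next to a WAL-mode database while a connection is open. The helper names `open_wal_database` and `wal_sidecar_files` are made up for illustration; they are not part of SQLite or of the server discussed in this thread.

```python
import os
import sqlite3


def wal_sidecar_files(db_path):
    """Report which WAL side files currently exist next to db_path."""
    return {suffix: os.path.exists(db_path + suffix) for suffix in ("-wal", "-shm")}


def open_wal_database(db_path):
    """Open db_path and switch it to WAL mode (hypothetical helper)."""
    conn = sqlite3.connect(db_path)
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        conn.close()
        raise RuntimeError("could not enable WAL mode, got: " + mode)
    return conn
```

Calling `wal_sidecar_files` from a second process while the first still holds its connection shows whether both processes see the same side files. It says nothing about whether the advisory locks on the "-shm" file are working.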

Correct: same machine, same non-networked disk, ext2 file system. I think your theory is a good one; I'm just not sure how to solve it. We've seen this for a long time (years now), but I agree that upgrading to the latest version might get us lucky. Bringing in Simon's questions:


Is your shared access to the database multi-thread, multi-process, or 
multi-computer ?  If multi-computer, how is each computer accessing the 
database file: which sort of network access are you using ?  If there's a local 
computer, what format is the hard disk formatted in ?

Multi-threaded (main thread and a checkpointing thread). The server is single-process, but the client operates in a separate process so I guess this particular issue is in fact "multi-process". It's only multi-computer insofar as we have a replication layer hovering high above.


The result codes indicating 'malformed database' are transient, right ?  After 
getting that result code, you can try the operation again immediately and 
perhaps not get that result code ?

The result only happens when we access via the command-line client -- the API never gives this issue. But yes, it's transient in that sometimes when you re-run the query it works. My theory is that the query succeeds only when the server isn't changing the database at the same moment, which is increasingly rare.
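
Since the error is transient, one stopgap is to re-run the query a few times before giving up. The sketch below assumes the error surfaces as `sqlite3.DatabaseError` with "malformed" in its message (SQLite's text for SQLITE_CORRUPT is "database disk image is malformed"). `query_with_retry` is a hypothetical helper, and retrying only hides the symptom; it does not address the underlying cause.

```python
import sqlite3
import time


def query_with_retry(run_query, attempts=3, delay=0.1):
    """Call run_query(), retrying only on 'malformed' database errors."""
    last_error = None
    for attempt in range(attempts):
        try:
            return run_query()
        except sqlite3.DatabaseError as exc:
            # Anything other than the transient 'malformed' error is re-raised.
            if "malformed" not in str(exc).lower():
                raise
            last_error = exc
            if attempt < attempts - 1:
                time.sleep(delay)
    # Every attempt failed with the 'malformed' error.
    raise last_error
```

A caller would pass a closure, for example `query_with_retry(lambda: conn.execute("SELECT count(*) FROM t").fetchone())`, where `conn` and the table `t` are placeholders.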


Have you ever seen them when you have only one thread, one process working on 
the database ?

No, this only happens when accessing the database with the command line client while it is actively being accessed by the server.


Do they happen when you have more than one thread/process reading the database ?

Uncertain. I don't know precisely what the server is doing at the exact moment that the command line app accesses the database.


Do they happen when you have more than one thread/process writing the database ?

We only have a writer thread and a checkpointing thread. We checkpoint every 5s; not sure if this affects it.
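
For reference, a rough Python sketch of the arrangement described above: a background thread with its own connection that runs a checkpoint at a fixed interval. This is not the actual server code; `start_checkpointer` is a hypothetical name, and the choice of a PASSIVE checkpoint is an assumption, since the thread does not say which checkpoint mode the server uses.

```python
import sqlite3
import threading


def start_checkpointer(db_path, interval=5.0):
    """Run PRAGMA wal_checkpoint(PASSIVE) every `interval` seconds in a thread."""
    stop = threading.Event()
    results = []  # one (busy, log_frames, checkpointed_frames) row per run

    def loop():
        # The connection is created and used only inside this thread.
        conn = sqlite3.connect(db_path)
        try:
            # Event.wait returns False on timeout and True once stop is set.
            while not stop.wait(interval):
                row = conn.execute("PRAGMA wal_checkpoint(PASSIVE)").fetchone()
                results.append(row)
        finally:
            conn.close()

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return stop, thread, results
```

Calling `stop.set()` and then `thread.join()` shuts the checkpointer down cleanly.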


Are you, perhaps, using -DSQLITE_SHM_DIRECTORY when compiling the servers,
but failing to use the same -D when compiling the command-line tool?

This is possible -- we haven't been hand-compiling the sqlite3 command-line binary at the same rate that we've been upgrading the library. (I didn't realize this was necessary: I thought old binaries were compatible with newer databases? Ah, but perhaps not when they're actively being accessed by newer code?) When we upgrade to the latest version, we'll hand-compile the binary to the same version and try again.
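
A quick way to spot a version mismatch between the command-line binary and a linked library is sketched below. Python's bundled SQLite stands in for the server's library here, which is an assumption; the function names are hypothetical. Note that this compares version numbers only and would not reveal a difference in compile-time flags such as -DSQLITE_SHM_DIRECTORY.

```python
import shutil
import sqlite3
import subprocess


def library_version():
    """Version of the SQLite library linked into this Python process."""
    return sqlite3.sqlite_version


def cli_version(binary="sqlite3"):
    """Version reported by `sqlite3 -version`, or None if the binary is missing."""
    path = shutil.which(binary)
    if path is None:
        return None
    out = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=True
    ).stdout
    fields = out.split()
    # The first field of the output is the version number.
    return fields[0] if fields else None


def versions_match(binary="sqlite3"):
    """True only if the CLI was found and reports the library's version."""
    cli = cli_version(binary)
    return cli is not None and cli == library_version()
```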

Thanks!

-david
Founder and CEO of Expensify
Follow us at http://twitter.com/expensify
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users