I've been picking away at this for the last few days and have it narrowed down 
fairly well.

It looks like if I turn off shared cache, it works fine (same application code).

If I run with SQLITE_DEBUG enabled, the first issue I run into is an assertion in 
sqlite3BtreeEnter: assert( sqlite3_mutex_held(p->db->mutex) );
The call stack leading to it is:

sqlite3BackupUpdate
backupOnePage
sqlite3BtreeGetReserve(p->pSrc)
sqlite3BtreeEnter

Looking up the stack, it appears that sqlite3BackupUpdate locks the mutex on the 
destination database but not the source.
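
To make the shape of that failing check concrete, here is a toy model in Python. It is not SQLite code: OwnedMutex, Db, btree_enter and backup_update are hypothetical names that only mimic the pattern "enter the btree only if the caller already holds that connection's mutex". It says nothing about whether taking the source mutex at that point is safe inside SQLite itself.

```python
import threading


class OwnedMutex:
    """A lock that remembers which thread holds it, like a debug-build mutex."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owner = None

    def __enter__(self):
        self._lock.acquire()
        self._owner = threading.get_ident()
        return self

    def __exit__(self, *exc):
        self._owner = None
        self._lock.release()

    def held(self):
        # True only if the calling thread is the current holder.
        return self._owner == threading.get_ident()


class Db:
    """Stand-in for a connection object that owns one mutex (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.mutex = OwnedMutex()


def btree_enter(db):
    # Same shape as the check quoted above: the caller must hold db.mutex.
    if not db.mutex.held():
        raise AssertionError(f"mutex for {db.name} is not held by this thread")


def backup_update(src, dst, lock_source):
    # Model of the observation: the destination mutex is always taken,
    # the source mutex only when lock_source is True.
    with dst.mutex:
        if lock_source:
            with src.mutex:
                btree_enter(src)
        else:
            btree_enter(src)
```

Calling backup_update(Db("source"), Db("destination"), lock_source=False) raises the AssertionError, which corresponds to the stack above; with lock_source=True the check passes in this toy model.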

As a test, I tried adding locking of the source db there, with bad results.
I then altered the definition of the asserts to make them non-fatal, and got a 
ton of assertions followed by the deadlock again.

Haven't tried to make a sample program yet, but the gist of it would be to have 
one (or more) threads doing lots of small transactions updating the database 
while another thread simultaneously and continuously makes a backup of the 
db (unrealistic scenario, it just makes the race easier to see).
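
A rough sketch of such a stress program, written against Python's standard sqlite3 module rather than the C API, might look like the following. Everything in it is an assumption made for illustration: the function name run_stress, the table t, the thread and iteration counts, and the shared_cache flag (which just appends cache=shared to the URI). It does not use SEE or encryption, so it may well not hit the SQLITE_HAS_CODEC path at all, and whether it deadlocks on any given SQLite build is not something the sketch establishes.

```python
import pathlib
import sqlite3
import threading


def run_stress(path, writers=2, txns=50, backups=5, shared_cache=False):
    """Writer threads do small IMMEDIATE transactions while one thread backs up."""
    uri = pathlib.Path(path).resolve().as_uri()
    if shared_cache:
        uri += "?cache=shared"  # assumption: toggles shared-cache mode per connection
    errors, backups_done = [], []

    # One-time setup on its own connection: WAL mode, as in the report.
    setup = sqlite3.connect(uri, uri=True)
    setup.execute("PRAGMA journal_mode=WAL")
    setup.execute("CREATE TABLE IF NOT EXISTS t (id INTEGER PRIMARY KEY, v TEXT)")
    setup.commit()
    setup.close()

    def writer(n):
        # Each thread uses its own connection; isolation_level=None lets us
        # issue BEGIN IMMEDIATE explicitly.
        con = sqlite3.connect(uri, uri=True, isolation_level=None, timeout=10)
        con.execute("PRAGMA synchronous=0")
        for i in range(txns):
            con.execute("BEGIN IMMEDIATE")
            con.execute("INSERT INTO t (v) VALUES (?)", (f"w{n}-{i}",))
            con.execute("COMMIT")
        con.close()

    def backup_loop():
        src = sqlite3.connect(uri, uri=True, timeout=10)
        for _ in range(backups):
            dst = sqlite3.connect(":memory:")  # destination private to this thread
            # Copy a few pages per step; busy/locked steps sleep and retry.
            src.backup(dst, pages=16, sleep=0.001)
            dst.close()
            backups_done.append(1)
        src.close()

    def guarded(fn, *args):
        # Record exceptions instead of losing them inside the thread.
        try:
            fn(*args)
        except Exception as exc:
            errors.append(repr(exc))

    threads = [threading.Thread(target=guarded, args=(writer, n)) for n in range(writers)]
    threads.append(threading.Thread(target=guarded, args=(backup_loop,)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # a hard deadlock would show up as this join never returning

    check = sqlite3.connect(uri, uri=True)
    rows = check.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    check.close()
    return rows, len(backups_done), errors
```

With shared_cache=False every insert and every backup should complete and the error list should stay empty. With shared_cache=True, writers may instead report "locked" errors, since the busy timeout does not cover shared-cache lock conflicts; those end up in the returned error list rather than being retried.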

It may or may not matter whether or not encryption is used, or more importantly 
whether SQLITE_HAS_CODEC is defined, since the portion of code that's asserting 
is only there when SQLITE_HAS_CODEC is defined.

At this point, I guess I'll just run without enabling shared cache, which seems 
to work just fine (a little better with regards to backups actually) and just 
hope this gets fixed in a future release.

Jon



It looks like it's unhappy that the mutex for the source database in the  

On Aug 25, 2012, at 1:33 PM, Jonathan Engle wrote:

> No, the deadlock is deeper than that, it's stuck trying to lock mutexes.  My 
> current theory is that the thread trying to update the page in the backup 
> destination database is what's causing trouble.
> 
> I also forgot to mention, each thread is using a different connection object 
> and that it's using shared cache mode.
> 
> Jon
> On Aug 25, 2012, at 12:57 PM, Patrik Nilsson wrote:
> 
>> Do you test for the backup errors, i.e. SQLITE_BUSY and SQLITE_LOCKED?
>> 
>> Do you test for step errors, i.e.  SQLITE_BUSY?
>> 
>> If you get the busy error, you can wait a while and try again or start over.
>> 
>> /Patrik
>> 
>> On 08/24/2012 05:46 PM, Jonathan Engle wrote:
>>> Ran into this recently; it's happened on one machine running a beta test of 
>>> our software.  This is a multi-threaded application, and I've run into a 
>>> sequence of steps that deadlocks hard and that, as far as I can tell from the 
>>> documentation, shouldn't. 
>>> This is using SQLite 3.7.13 with SEE.
>>> The source database is using WAL mode, all transactions are done as 
>>> IMMEDIATE, synchronous mode is set to 0, and it is encrypted.
>>> The destination database for the backup is not encrypted, and uses the 
>>> default (non-WAL, full synchronous) modes.
>>> 
>>> 
>>> There are multiple threads active:
>>> 
>>> - one performing a write
>>> - two performing reads
>>> - one closing a connection
>>> - one is in the middle of a backup operation
>>> 
>>> Here are the call stacks for the threads:
>>> 
>>> 
>>> Writing thread:
>>> 
>>> sqlite3_step
>>> sqlite3VdbeExec
>>> sqlite3VdbeHalt      
>>> sqlite3BtreeCommitPhaseOne      
>>> sqlite3PagerCommitPhaseOne      
>>> pagerWalFrames      
>>> sqlite3BackupUpdate      
>>> backupOnePage      
>>> sqlite3BtreeEnter      
>>> lockBtreeMutex      
>>> pthread_mutex_lock      
>>> __psynch_mutexwait      
>>> 
>>> Closing a connection thread:
>>> 
>>> sqlite3_close      
>>> sqlite3BtreeEnterAll      
>>> sqlite3BtreeEnter      
>>> lockBtreeMutex      
>>> pthread_mutex_lock      
>>> __psynch_mutexwait      
>>> 
>>> Reading thread:
>>>     
>>> sqlite3_step      
>>> sqlite3VdbeExec      
>>> sqlite3VdbeEnter      
>>> sqlite3BtreeEnter      
>>> lockBtreeMutex      
>>> pthread_mutex_lock      
>>> __psynch_mutexwait      
>>> 
>>> Backing up thread:
>>>     
>>> sqlite3_backup_step      
>>> sqlite3BtreeEnter      
>>> lockBtreeMutex      
>>> pthread_mutex_lock      
>>> __psynch_mutexwait      
>>>     
>>> Reading thread:
>>> 
>>> sqlite3_step      
>>> sqlite3VdbeExec      
>>> sqlite3VdbeEnter      
>>> sqlite3BtreeEnter      
>>> lockBtreeMutex      
>>> pthread_mutex_lock      
>>> __psynch_mutexwait
>>> 
>>> 
>>> 
>>> Also, the destination database for the backup is created on the stack by 
>>> the thread doing the backup and is never passed out to anybody 
>>> (explicitly).
>>> 
>>> What it looks like is happening to me is that the writing and backing-up 
>>> threads are deadlocking with each other, with 'sqlite3BackupUpdate' 
>>> attempting to update the backup destination database.  Unfortunately, this 
>>> is not something I've reproduced locally, so I can't look at parameters or 
>>> lock states.  I'm going to try, as a kind of hail-mary, putting a BEGIN 
>>> IMMEDIATE transaction around the backup to block writing during the 
>>> database backup.
>>> 
>>> If anyone has any suggestions or ideas about what I might be doing wrong 
>>> here, I'd appreciate it.
>>> 
>>> 
>>> _______________________________________________
>>> sqlite-users mailing list
>>> sqlite-users@sqlite.org
>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>> 
